SlideShare a Scribd company logo
1 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Apache Phoenix and HBase: Past,
Present and Future of SQL over HBase
27-10-2016
Enis Soztutar (enis@hortonworks.com)
Ankit Singhal (asinghal@hortonworks.com)
2 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
About Me
Enis Soztutar
Committer and PMC member in Apache HBase, Phoenix, and Hadoop
HBase/Phoenix team @Hortonworks
Twitter @enissoz
Disclaimer: Not a SQL expert!
3 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Outline
PART I – The Past (a.k.a. All the existing stuff)
 Phoenix the basics
 Architecture
 Overview of existing Phoenix features
PART II – The Present (a.k.a. All the recent stuff)
 Look at recent releases
 Transactions
 Phoenix Query Server
 Other features
PART III – The Future (a.k.a. All the upcoming stuff)
 Calcite integration
 Phoenix – Hive
4 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Part I – The Past
All the existing stuff !
5 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Obligatory Slide - Who uses Phoenix
6 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix – The Basics
• Hope everybody is familiar with HBase
• Otherwise you are in the wrong talk!
• What is wrong with pure-HBase?
• HBase is a powerful, flexible and extensible “engine”
• Too low level
• Have to write java code to do anything!
• Phoenix is relational layer over HBase
• Also described as a SQL-Skin
• Looking more and more like a generic SQL engine
• Why not Hive / Spark SQL / other SQL-over-Hadoop
• OTLP versus OLAP
• As fast as HBase, 1 ms query, 10K-1M qps
7 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Why SQL?
8 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
From CDK Global slides
https://phoenix.apache.o
rg/presentations/StrataH
adoopWorld.pdf
9 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
HBase Architecture
DataNode
RegionServer 2
T:foo, region:a
T:bar, region:54
T:foo, region:t
Application
HBase client
DataNode
RegionServer 1
T:foo, region:c
T:bar, region:14
T:foo, region:d
DataNode
RegionServer 3
T:bar, region:32
T:foo, region:k
ZooKeeper
Quorum
10 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix Architecture
DataNode
RegionServer 2
T:foo, region:c
T:bar, region:54
T:foo, region:t
Phoenix RPC endpoint
px
px
Application
Phoenix client / JDBC
HBase client
DataNode
RegionServer 1
T:foo, region:c
T:bar, region:14
T:foo, region:d
Phoenix RPC endpoint
px
px
DataNode
RegionServer 3
T:SYSTEM.CATALOG
T:bar, region:32
T:foo, region:k
Phoenix RPC endpoint
px
px
ZooKeeper
Quorum
11 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix Goodies
SQL DataTypes
Schemas / DDL / HBase table properties
Composite Types (Composite Primary Key)
Map existing HBase tables
Write from HBase, read from Phoenix
Salting
Parallel Scan
Skip scan
Filter push down
Statistics Collection / Guideposts
12 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage12 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
DDL Example
CREATE TABLE IF NOT EXISTS METRIC_RECORD (
METRIC_NAME VARCHAR,
HOSTNAME VARCHAR,
SERVER_TIME UNSIGNED_LONG NOT NULL
METRIC_VALUE DOUBLE,
…
CONSTRAINT pk PRIMARY KEY (METRIC_NAME, HOSTNAME,
SERVER_TIME))
DATA_BLOCK_ENCODING=’FAST_DIFF', TTL=604800,
COMPRESSION=‘SNAPPY’
SPLIT ON ('a', 'k', 'm');
13 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage13 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
METRIC_NAME HOSTNAME SERVER_TIME METRIC_VALUE
Regionserver.readRequestCount cn011.hortonworks.com 1396743589 92045759
Regionserver.readRequestCount cn011.hortonworks.com 1396767589 93051916
Regionserver.readRequestCount cn011.hortonworks.com …. …
Regionserver.readRequestCount cn012. hortonworks.com 1396743589
….. … … …
Regionserver.wal.bytesWritten cn011.hortonworks.com
Regionserver.wal.bytesWritten …. …. …
SORT ORDERSORTORDER
HBASE ROW KEY OTHER COLUMNS
14 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage14 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Parallel Scan
SELECT * FROM METRIC_RECORD;
CLIENT 4-CHUNK PARALLEL 1-WAY
FULL SCAN OVER METRIC_RECORD
Region1
Region2
Region3
Region4
Client
RS3RS2
RS1
scanscanscanscan
15 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Filter push down
SELECT * FROM METRIC_RECORD WHERE
SERVER_TIME > NOW() - 7;
CLIENT 4-CHUNK PARALLEL 1-WAY
FULL SCAN OVER METRIC_RECORD
SERVER FILTER BY
SERVER_TIME > DATE
'2016-04-06 09:09:05.978’
Region1
Region2
Region3
Region4
Client
RS3RS2RS1
scanscanscanscan
Server-side Filter
16 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Skip Scan
SELECT * FROM METRIC_RECORD
WHERE METRIC_NAME LIKE 'abc%'
AND HOSTNAME in ('host1’,
'host2');
CLIENT 1-CHUNK PARALLEL 1-WAY SKIP
SCAN ON 2 RANGES OVER METRIC_RECORD
['abc','host1'] - ['abd','host2']
Region1
Region2
Region3
Region4
Client
RS3RS2RS1
Skip scan
17 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage17 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TopN
SELECT * FROM METRIC_RECORD
WHERE SERVER_TIME > NOW() - 7
ORDER BY HOSTNAME LIMIT 5;
CLIENT 4-CHUNK PARALLEL 4-WAY FULL
SCAN OVER METRIC_RECORD
SERVER FILTER BY SERVER_TIME > …
SERVER TOP 5 ROWS SORTED BY
[HOSTNAME]
CLIENT MERGE SORT
Region1
Region2
Region3
Region4
Client
RS3RS2RS1
scanscanscanscan
Sort by HOSTNAME
Return only 5 ROWS
18 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage18 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Aggregation
SELECT METRIC_NAME, HOSTNAME,
AVG(METRIC_VALUE)
FROM METRIC_RECORD
WHERE SERVER_TIME > NOW() - 7
GROUP BY METRIC_NAME, HOSTNAME
ORDER BY METRIC_NAME, HOSTNAME;
CLIENT 4-CHUNK PARALLEL 1-WAY FULL
SCAN OVER METRIC_RECORD
SERVER FILTER BY SERVER_TIME > …
SERVER AGGREGATE INTO ORDERED
DISTINCT ROWS BY
[METRIC_NAME, HOSTNAME]
CLIENT MERGE SORT
Region1
Region2
Region3
Region4
Client
RS3RS2RS1
scanscanscanscan
Return only
aggregated data by
METRIC_NAME,
HOSTNAME
19 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage19 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Joins and subqueries in Phoenix
Grammar
• Inner, Left, Right, Full outer join, Cross join
• Semi-join / Anti-join
Algorithms
• Hash-join, sort-merge join
• Hash-join table is computed and pushed to each regionserver from client
Optimizations
• Predicate push-down
• PK-to-FK join optimization
• Global index with missing columns
• Correlated query rewrite
20 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage20 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Joins and subqueries in Phoenix
Phoenix can execute most of TPC-H queries!
No nested loop join
With Calcite support, more improvements soon
No statistical Guided join selection yet
Not very good at executing very big joins
• No generic YARN / Tez execution layer
• But Hive / Spark support for generic DAG execution
21 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage21 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Secondary Indexes
HBase table is a sorted map
• Everything in HBase is sorted in primary key order
• Full or partial scans in sort order is very efficient in HBase
• Sort data differently with secondary index dimensions
Two types
• Global index
• Local index
Query
• Indexes are “covered”
• Indexes are automatically selected from queries
• Only covered columns are returned from index without going back to data table
22 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage22 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Global and Local Index
Global Index
• A single instance for all table data in a different
sort order
• A different HBase table per index
• Optimized for read-heavy use cases
• Can be one edit “behind” actual primary data
• Transactional tables indices have ACID
guarantees
• Different consistency / durability for mutable /
immutable tables
Local Index
• Multiple mini-instances per region
• Uses same HBase table, different cf
• Optimized for write-heavy use cases
• Atomic commit and visibility (coming soon)
• Queries have to ask all regions for relevant data
from index
23 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage23 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Part II – The Present
All the recent stuff !
24 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage24 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Release Note Highlights
4.4
• Functional Indexes
• UDFs
• Query Server
• UNION ALL
• MR Index Build
• Spark Integration
• Date built-in functions
4.5
• Client-side per-statement metrics
• SELECT without FROM
• ALTER TABLE with VIEWS
• Math and Array built-in functions
25 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage25 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Release Note Highlights
4.6
• ROW_TIMESTAMP for HBase native timestamps
• Support for correlate variable
• Support for un-nesting arrays
• Web-app for visualizing trace info (alpha)
4.7
• Transaction support
• Enhanced secondary index consistency guarantees
• Statistics improvements
• Perf improvements
26 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage26 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Row Timestamps
A pseudo-column for HBase native timestamps (versions)
Enables setting and querying cell timestamps
Perfect for time-series use cases
• Combine with FIFO / Date Tiered Compaction policies
• And HBase scan file pruning based on min-max ts for very efficient scans
CREATE TABLE METRICS_TABLE (
CREATED_DATE NOT NULL DATE,
METRIC_ID NOT NULL CHAR(15), METRIC_VALUE LONG
CONSTRAINT PK PRIMARY KEY(CREATED_DATE ROW_TIMESTAMP,
METRIC_ID)) SALT_BUCKETS = 8;
27 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage27 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Transactions
Uses Tephra
Snapshot isolation semantics
Completely optional.
• Can be enabled per-table (TRANSACTIONAL=true)
• Transactional and non-transactional tables can live side by side
Transactions see their own uncommitted data
Released in 4.7, will GA in 5.0
Optimistic Concurrency Control
• No locking for rows
• Transactions have to roll back and undo their writes in case of conflict
• Cost of conflict is higher
28 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage28 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Tephra Architecture
RegionServer 2
Tephra / HBase Client
RegionServer 1 RegionServer 3
HBase client
ZooKeeper
Quorum
Tephra Trx Manager
(active)
Tephra Trx Manager
(standby)
29 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage29 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Transaction Lifecycle
From Tephra presentation
http://www.slideshare.ne
t/alexbaranau/transactio
ns-over-hbase
30 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage30 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix Query Server
Similar to HBase REST Server / Hive Server 2
Built on top of Calcite’s Avatica Server with Phoenix bindings
Embeds a Phoenix thick client inside
No client side sorting / join!
Protobuf-3.0 over HTTP protocol
Has a (thin) JDBC driver
Allows ODBC driver for Phoenix
31 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage31 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix architecture revisited (thick client)
RegionServer 2
T:foo, region:d
Phoenix RPC endpoint
px
Application
RegionServer 1
T:foo, region:d
Phoenix RPC endpoint
px
RegionServer 3
T:foo, region:d
Phoenix RPC endpoint
px
HBase client
Phoenix client / JDBC
32 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage32 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix Query Server
Phoenix Query Server (thin client)
RegionServer 2
T:foo, region:d
Phoenix RPC endpoint
px
Application
Phoenix thin client / JDBC
RegionServer 1
T:foo, region:d
Phoenix RPC endpoint
px
RegionServer 3
T:foo, region:d
Phoenix RPC endpoint
px
Phoenix client / JDBC
HBase client
Phoenix Query Server
Phoenix client / JDBC
HBase client
Phoenix Query Server
Phoenix client / JDBC
HBase client
33 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage33 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Other new features (4.8+)
Shaded client by default. No more library dependency problems!
Phoenix schema mapping to HBase namespace
• Allows using isolation and security features of HBase namespaces
• Standard SQL syntax:
CREATE SCHEMA FOO;
USE FOO;
LIMIT / OFFSET
• We already had LIMIT. Now we have OFFSET
• Together with Row-Value-Constructs, covers most of cursor use cases
34 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage34 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Part III – The Future
All the upcoming stuff !
35 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage35 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Local Index
• Local Index re-implemented
• Instead of a different table, now local index data is kept within the same data
table
• Local index data goes into a different column family
• Index and data is committed together atomically without external transactions
• Bunch of stability improvements with region splits and merges
• Building local index is faster as it reads and write data to same region.
36 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage36 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Calcite Integration
Calcite is a framework for:
• Query parser
• Compiler
• Planner
• Cost based optimizer
SQL-92 compliant
Based on relational algebra
Cost based optimizer with default rules + pluggable rules per-backend
Used by Hive / Drill / Kylin / Samza, etc.
37 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage37 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Calcite Integration
38 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage38 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Phoenix - Hive integration
Hive is a very rich and generic execution engine
Uses Tez + YARN to execute arbitrary DAG
Hive integration enables big joins and other Hive features
Phoenix DDL with HiveQL
Data insert / update delete (DML) with HiveQL
Predicate pushdown, salting, partitioning, partition pruning, etc
Can use secondary indexes as well since it uses Phoenix compiler
https://issues.apache.org/jira/browse/PHOENIX-2743
39 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage39 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Future<Phoenix>
JSON support
TPC-H / Microstrategy / Tableau queries
Sqoop integration
Support Omid based transactions
Dogfooding within the Hadoop-ecosystem
• Ambari Metrics Service (AMS) uses Phoenix
• YARN will soon use HBase / Phoenix (ATS)
STRUCT type
Improvements to cost based optimization
Security and other HBase features used from Phoenix
See https://phoenix.apache.org/roadmap.html
40 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage40 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Further Reference
Even more info on https://phoenix.apache.org
 New Features: https://phoenix.apache.org/recent.html
 Roadmap: https://phoenix.apache.org/roadmap.html
Get involved in mailing lists
 user@phoenix.apache.org
 dev@phoenix.apache.org
41 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage41 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Thanks
Q & A

More Related Content

What's hot

Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
enissoz
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Josh Elser
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics Manager
Sriharsha Chintalapani
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
Josh Elser
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
Bryan Bende
 
Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
University of Moratuwa
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
DataWorks Summit/Hadoop Summit
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon
 
De-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerDe-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServer
Josh Elser
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
Bryan Bende
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
DataWorks Summit/Hadoop Summit
 
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
Trieu Nguyen
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and Apex
Bryan Bende
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
HBaseCon
 
Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBase
Josh Elser
 
Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
DataWorks Summit/Hadoop Summit
 

What's hot (20)

Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics Manager
 
Apache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to UnderstandApache HBase Internals you hoped you Never Needed to Understand
Apache HBase Internals you hoped you Never Needed to Understand
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
 
De-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServerDe-Mystifying the Apache Phoenix QueryServer
De-Mystifying the Apache Phoenix QueryServer
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
 
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
 
Integrating NiFi and Apex
Integrating NiFi and ApexIntegrating NiFi and Apex
Integrating NiFi and Apex
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBase
 
Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
 

Similar to Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan

Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
DataWorks Summit
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
DataWorks Summit
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
alanfgates
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
DataWorks Summit
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 
SoCal BigData Day
SoCal BigData DaySoCal BigData Day
SoCal BigData Day
John Park
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
DataWorks Summit
 
Apache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration storyApache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration story
Sunil Govindan
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
DataWorks Summit
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Vinod Kumar Vavilapalli
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & Future
DataWorks Summit
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
Ankit Singhal
 
Kinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveKinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-dive
Yifeng Jiang
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Wangda Tan
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
DataWorks Summit
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
DataWorks Summit
 
What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4
DataWorks Summit
 
Apache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft LibraryApache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft Library
Tsz-Wo (Nicholas) Sze
 

Similar to Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan (20)

Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Design Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data AnalyticsDesign Patterns For Real Time Streaming Data Analytics
Design Patterns For Real Time Streaming Data Analytics
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
SoCal BigData Day
SoCal BigData DaySoCal BigData Day
SoCal BigData Day
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
 
Apache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration storyApache Hadoop 3 updates with migration story
Apache Hadoop 3 updates with migration story
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Data Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat AlwellData Con LA 2018 - Streaming and IoT by Pat Alwell
Data Con LA 2018 - Streaming and IoT by Pat Alwell
 
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and FutureHadoop Summit San Jose 2015: YARN - Past, Present and Future
Hadoop Summit San Jose 2015: YARN - Past, Present and Future
 
YARN - Past, Present, & Future
YARN - Past, Present, & FutureYARN - Past, Present, & Future
YARN - Past, Present, & Future
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Kinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-diveKinesis vs-kafka-and-kafka-deep-dive
Kinesis vs-kafka-and-kafka-deep-dive
 
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The UnionDataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
Dataworks Berlin Summit 18' - Apache hadoop YARN State Of The Union
 
Apache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the unionApache Hadoop YARN: state of the union
Apache Hadoop YARN: state of the union
 
What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
 
What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4What’s new in Apache Spark 2.3 and Spark 2.4
What’s new in Apache Spark 2.3 and Spark 2.4
 
Apache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft LibraryApache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft Library
 

Recently uploaded

Field Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdfField Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdf
hritikbui
 
Systane Global education training centre
Systane Global education training centreSystane Global education training centre
Systane Global education training centre
AkhinaRomdoni
 
SOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERING
SOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERINGSOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERING
SOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERING
PrabhuB33
 
DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...
DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...
DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...
JeevanKp7
 
Audits Of Complaints Against the PPD Report_2022.pdf
Audits Of Complaints Against the PPD Report_2022.pdfAudits Of Complaints Against the PPD Report_2022.pdf
Audits Of Complaints Against the PPD Report_2022.pdf
evwcarr
 
Big Data and Analytics Shaping the future of Payments
Big Data and Analytics Shaping the future of PaymentsBig Data and Analytics Shaping the future of Payments
Big Data and Analytics Shaping the future of Payments
RuchiRathor2
 
Cal Girls Hotel Safari Jaipur | | Girls Call Free Drop Service
Cal Girls Hotel Safari Jaipur | | Girls Call Free Drop ServiceCal Girls Hotel Safari Jaipur | | Girls Call Free Drop Service
Cal Girls Hotel Safari Jaipur | | Girls Call Free Drop Service
Deepikakumari457585
 
PRODUCT | RESEARCH-PRESENTATION-1.1.pptx
PRODUCT | RESEARCH-PRESENTATION-1.1.pptxPRODUCT | RESEARCH-PRESENTATION-1.1.pptx
PRODUCT | RESEARCH-PRESENTATION-1.1.pptx
amazenolmedojeruel
 
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
weiwchu
 
Vrinda store data analysis project using Excel
Vrinda store data analysis project using ExcelVrinda store data analysis project using Excel
Vrinda store data analysis project using Excel
SantuJana12
 
Full Disclosure Board Policy.docx BRGY LICUMA
Full  Disclosure Board Policy.docx BRGY LICUMAFull  Disclosure Board Policy.docx BRGY LICUMA
Full Disclosure Board Policy.docx BRGY LICUMA
brgylicumaormoccity
 
Getting Started with Interactive Brokers API and Python.pdf
Getting Started with Interactive Brokers API and Python.pdfGetting Started with Interactive Brokers API and Python.pdf
Getting Started with Interactive Brokers API and Python.pdf
Riya Sen
 
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...
rightmanforbloodline
 
CT AnGIOGRAPHY of pulmonary embolism.pptx
CT AnGIOGRAPHY of pulmonary embolism.pptxCT AnGIOGRAPHY of pulmonary embolism.pptx
CT AnGIOGRAPHY of pulmonary embolism.pptx
RejoJohn2
 
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Alireza Kamrani
 
Selcuk Topal Arbitrum Scientific Report.pdf
Selcuk Topal Arbitrum Scientific Report.pdfSelcuk Topal Arbitrum Scientific Report.pdf
Selcuk Topal Arbitrum Scientific Report.pdf
SelcukTOPAL2
 
Accounting and Auditing Laws-Rules-and-Regulations
Accounting and Auditing Laws-Rules-and-RegulationsAccounting and Auditing Laws-Rules-and-Regulations
Accounting and Auditing Laws-Rules-and-Regulations
DALubis
 
Data Analytics for Decision Making By District 11 Solutions
Data Analytics for Decision Making By District 11 SolutionsData Analytics for Decision Making By District 11 Solutions
Data Analytics for Decision Making By District 11 Solutions
District 11 Solutions
 
393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf
393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf
393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf
Ladislau5
 
Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...
Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...
Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...
deepikakumaridk25
 

Recently uploaded (20)

Field Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdfField Diary and lab record, Importance.pdf
Field Diary and lab record, Importance.pdf
 
Systane Global education training centre
Systane Global education training centreSystane Global education training centre
Systane Global education training centre
 
SOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERING
SOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERINGSOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERING
SOFTWARE ENGINEERING-UNIT-1SOFTWARE ENGINEERING
 
DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...
DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...
DESIGN AND DEVELOPMENT OF AUTO OXYGEN CONCENTRATOR WITH SOS ALERT FOR HIKING ...
 
Audits Of Complaints Against the PPD Report_2022.pdf
Audits Of Complaints Against the PPD Report_2022.pdfAudits Of Complaints Against the PPD Report_2022.pdf
Audits Of Complaints Against the PPD Report_2022.pdf
 
Big Data and Analytics Shaping the future of Payments
Big Data and Analytics Shaping the future of PaymentsBig Data and Analytics Shaping the future of Payments
Big Data and Analytics Shaping the future of Payments
 
Cal Girls Hotel Safari Jaipur | | Girls Call Free Drop Service
Cal Girls Hotel Safari Jaipur | | Girls Call Free Drop ServiceCal Girls Hotel Safari Jaipur | | Girls Call Free Drop Service
Cal Girls Hotel Safari Jaipur | | Girls Call Free Drop Service
 
PRODUCT | RESEARCH-PRESENTATION-1.1.pptx
PRODUCT | RESEARCH-PRESENTATION-1.1.pptxPRODUCT | RESEARCH-PRESENTATION-1.1.pptx
PRODUCT | RESEARCH-PRESENTATION-1.1.pptx
 
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
Harnessing Wild and Untamed (Publicly Available) Data for the Cost efficient ...
 
Vrinda store data analysis project using Excel
Vrinda store data analysis project using ExcelVrinda store data analysis project using Excel
Vrinda store data analysis project using Excel
 
Full Disclosure Board Policy.docx BRGY LICUMA
Full  Disclosure Board Policy.docx BRGY LICUMAFull  Disclosure Board Policy.docx BRGY LICUMA
Full Disclosure Board Policy.docx BRGY LICUMA
 
Getting Started with Interactive Brokers API and Python.pdf
Getting Started with Interactive Brokers API and Python.pdfGetting Started with Interactive Brokers API and Python.pdf
Getting Started with Interactive Brokers API and Python.pdf
 
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...
Solution Manual for First Course in Abstract Algebra A, 8th Edition by John B...
 
CT AnGIOGRAPHY of pulmonary embolism.pptx
CT AnGIOGRAPHY of pulmonary embolism.pptxCT AnGIOGRAPHY of pulmonary embolism.pptx
CT AnGIOGRAPHY of pulmonary embolism.pptx
 
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
Dataguard Switchover Best Practices using DGMGRL (Dataguard Broker Command Line)
 
Selcuk Topal Arbitrum Scientific Report.pdf
Selcuk Topal Arbitrum Scientific Report.pdfSelcuk Topal Arbitrum Scientific Report.pdf
Selcuk Topal Arbitrum Scientific Report.pdf
 
Accounting and Auditing Laws-Rules-and-Regulations
Accounting and Auditing Laws-Rules-and-RegulationsAccounting and Auditing Laws-Rules-and-Regulations
Accounting and Auditing Laws-Rules-and-Regulations
 
Data Analytics for Decision Making By District 11 Solutions
Data Analytics for Decision Making By District 11 SolutionsData Analytics for Decision Making By District 11 Solutions
Data Analytics for Decision Making By District 11 Solutions
 
393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf
393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf
393947940-The-Dell-EMC-PowerMax-Family-Overview.pdf
 
Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...
Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...
Cal Girls The Lalit Jaipur 8445551418 Khusi Top Class Girls Call Jaipur Avail...
 

Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Apache Phoenix and HBase: Past, Present and Future of SQL over HBase 27-10-2016 Enis Soztutar (enis@hortonworks.com) Ankit Singhal (asinghal@hortonworks.com)
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved About Me Enis Soztutar Committer and PMC member in Apache HBase, Phoenix, and Hadoop HBase/Phoenix team @Hortonworks Twitter @enissoz Disclaimer: Not a SQL expert!
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Outline PART I – The Past (a.k.a. All the existing stuff)  Phoenix the basics  Architecture  Overview of existing Phoenix features PART II – The Present (a.k.a. All the recent stuff)  Look at recent releases  Transactions  Phoenix Query Server  Other features PART III – The Future (a.k.a. All the upcoming stuff)  Calcite integration  Phoenix – Hive
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Part I – The Past All the existing stuff !
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Obligatory Slide - Who uses Phoenix
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix – The Basics • Hope everybody is familiar with HBase • Otherwise you are in the wrong talk! • What is wrong with pure-HBase? • HBase is a powerful, flexible and extensible “engine” • Too low level • Have to write java code to do anything! • Phoenix is relational layer over HBase • Also described as a SQL-Skin • Looking more and more like a generic SQL engine • Why not Hive / Spark SQL / other SQL-over-Hadoop • OTLP versus OLAP • As fast as HBase, 1 ms query, 10K-1M qps
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Why SQL?
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved From CDK Global slides https://phoenix.apache.o rg/presentations/StrataH adoopWorld.pdf
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved HBase Architecture DataNode RegionServer 2 T:foo, region:a T:bar, region:54 T:foo, region:t Application HBase client DataNode RegionServer 1 T:foo, region:c T:bar, region:14 T:foo, region:d DataNode RegionServer 3 T:bar, region:32 T:foo, region:k ZooKeeper Quorum
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix Architecture DataNode RegionServer 2 T:foo, region:c T:bar, region:54 T:foo, region:t Phoenix RPC endpoint px px Application Phoenix client / JDBC HBase client DataNode RegionServer 1 T:foo, region:c T:bar, region:14 T:foo, region:d Phoenix RPC endpoint px px DataNode RegionServer 3 T:SYSTEM.CATALOG T:bar, region:32 T:foo, region:k Phoenix RPC endpoint px px ZooKeeper Quorum
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix Goodies SQL DataTypes Schemas / DDL / HBase table properties Composite Types (Composite Primary Key) Map existing HBase tables Write from HBase, read from Phoenix Salting Parallel Scan Skip scan Filter push down Statistics Collection / Guideposts
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage12 © Hortonworks Inc. 2011 – 2014. All Rights Reserved DDL Example CREATE TABLE IF NOT EXISTS METRIC_RECORD ( METRIC_NAME VARCHAR, HOSTNAME VARCHAR, SERVER_TIME UNSIGNED_LONG NOT NULL METRIC_VALUE DOUBLE, … CONSTRAINT pk PRIMARY KEY (METRIC_NAME, HOSTNAME, SERVER_TIME)) DATA_BLOCK_ENCODING=’FAST_DIFF', TTL=604800, COMPRESSION=‘SNAPPY’ SPLIT ON ('a', 'k', 'm');
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage13 © Hortonworks Inc. 2011 – 2014. All Rights Reserved METRIC_NAME HOSTNAME SERVER_TIME METRIC_VALUE Regionserver.readRequestCount cn011.hortonworks.com 1396743589 92045759 Regionserver.readRequestCount cn011.hortonworks.com 1396767589 93051916 Regionserver.readRequestCount cn011.hortonworks.com …. … Regionserver.readRequestCount cn012. hortonworks.com 1396743589 ….. … … … Regionserver.wal.bytesWritten cn011.hortonworks.com Regionserver.wal.bytesWritten …. …. … SORT ORDERSORTORDER HBASE ROW KEY OTHER COLUMNS
  • 14. 14 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage14 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Parallel Scan SELECT * FROM METRIC_RECORD; CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD Region1 Region2 Region3 Region4 Client RS3RS2 RS1 scanscanscanscan
  • 15. 15 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Filter push down SELECT * FROM METRIC_RECORD WHERE SERVER_TIME > NOW() - 7; CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD SERVER FILTER BY SERVER_TIME > DATE '2016-04-06 09:09:05.978’ Region1 Region2 Region3 Region4 Client RS3RS2RS1 scanscanscanscan Server-side Filter
  • 16. 16 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Skip Scan SELECT * FROM METRIC_RECORD WHERE METRIC_NAME LIKE 'abc%' AND HOSTNAME in ('host1’, 'host2'); CLIENT 1-CHUNK PARALLEL 1-WAY SKIP SCAN ON 2 RANGES OVER METRIC_RECORD ['abc','host1'] - ['abd','host2'] Region1 Region2 Region3 Region4 Client RS3RS2RS1 Skip scan
  • 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage17 © Hortonworks Inc. 2011 – 2014. All Rights Reserved TopN SELECT * FROM METRIC_RECORD WHERE SERVER_TIME > NOW() - 7 ORDER BY HOSTNAME LIMIT 5; CLIENT 4-CHUNK PARALLEL 4-WAY FULL SCAN OVER METRIC_RECORD SERVER FILTER BY SERVER_TIME > … SERVER TOP 5 ROWS SORTED BY [HOSTNAME] CLIENT MERGE SORT Region1 Region2 Region3 Region4 Client RS3RS2RS1 scanscanscanscan Sort by HOSTNAME Return only 5 ROWS
  • 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage18 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Aggregation SELECT METRIC_NAME, HOSTNAME, AVG(METRIC_VALUE) FROM METRIC_RECORD WHERE SERVER_TIME > NOW() - 7 GROUP BY METRIC_NAME, HOSTNAME ORDER BY METRIC_NAME, HOSTNAME; CLIENT 4-CHUNK PARALLEL 1-WAY FULL SCAN OVER METRIC_RECORD SERVER FILTER BY SERVER_TIME > … SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [METRIC_NAME, HOSTNAME] CLIENT MERGE SORT Region1 Region2 Region3 Region4 Client RS3RS2RS1 scanscanscanscan Return only aggregated data by METRIC_NAME, HOSTNAME
  • 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage19 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Joins and subqueries in Phoenix Grammar • Inner, Left, Right, Full outer join, Cross join • Semi-join / Anti-join Algorithms • Hash-join, sort-merge join • Hash-join table is computed and pushed to each regionserver from client Optimizations • Predicate push-down • PK-to-FK join optimization • Global index with missing columns • Correlated query rewrite
  • 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage20 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Joins and subqueries in Phoenix Phoenix can execute most of TPC-H queries! No nested loop join With Calcite support, more improvements soon No statistical Guided join selection yet Not very good at executing very big joins • No generic YARN / Tez execution layer • But Hive / Spark support for generic DAG execution
  • 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage21 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Secondary Indexes HBase table is a sorted map • Everything in HBase is sorted in primary key order • Full or partial scans in sort order is very efficient in HBase • Sort data differently with secondary index dimensions Two types • Global index • Local index Query • Indexes are “covered” • Indexes are automatically selected from queries • Only covered columns are returned from index without going back to data table
  • 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage22 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Global and Local Index Global Index • A single instance for all table data in a different sort order • A different HBase table per index • Optimized for read-heavy use cases • Can be one edit “behind” actual primary data • Transactional tables indices have ACID guarantees • Different consistency / durability for mutable / immutable tables Local Index • Multiple mini-instances per region • Uses same HBase table, different cf • Optimized for write-heavy use cases • Atomic commit and visibility (coming soon) • Queries have to ask all regions for relevant data from index
  • 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage23 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Part II – The Present All the recent stuff !
  • 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage24 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Release Note Highlights 4.4 • Functional Indexes • UDFs • Query Server • UNION ALL • MR Index Build • Spark Integration • Date built-in functions 4.5 • Client-side per-statement metrics • SELECT without FROM • ALTER TABLE with VIEWS • Math and Array built-in functions
  • 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage25 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Release Note Highlights 4.6 • ROW_TIMESTAMP for HBase native timestamps • Support for correlate variable • Support for un-nesting arrays • Web-app for visualizing trace info (alpha) 4.7 • Transaction support • Enhanced secondary index consistency guarantees • Statistics improvements • Perf improvements
  • 26. 26 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage26 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Row Timestamps A pseudo-column for HBase native timestamps (versions) Enables setting and querying cell timestamps Perfect for time-series use cases • Combine with FIFO / Date Tiered Compaction policies • And HBase scan file pruning based on min-max ts for very efficient scans CREATE TABLE METRICS_TABLE ( CREATED_DATE NOT NULL DATE, METRIC_ID NOT NULL CHAR(15), METRIC_VALUE LONG CONSTRAINT PK PRIMARY KEY(CREATED_DATE ROW_TIMESTAMP, METRIC_ID)) SALT_BUCKETS = 8;
  • 27. 27 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage27 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Transactions Uses Tephra Snapshot isolation semantics Completely optional. • Can be enabled per-table (TRANSACTIONAL=true) • Transactional and non-transactional tables can live side by side Transactions see their own uncommitted data Released in 4.7, will GA in 5.0 Optimistic Concurrency Control • No locking for rows • Transactions have to roll back and undo their writes in case of conflict • Cost of conflict is higher
  • 28. 28 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage28 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Tephra Architecture RegionServer 2 Tephra / HBase Client RegionServer 1 RegionServer 3 HBase client ZooKeeper Quorum Tephra Trx Manager (active) Tephra Trx Manager (standby)
  • 29. 29 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage29 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Transaction Lifecycle From Tephra presentation http://www.slideshare.ne t/alexbaranau/transactio ns-over-hbase
  • 30. 30 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage30 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix Query Server Similar to HBase REST Server / Hive Server 2 Built on top of Calcite’s Avatica Server with Phoenix bindings Embeds a Phoenix thick client inside No client side sorting / join! Protobuf-3.0 over HTTP protocol Has a (thin) JDBC driver Allows ODBC driver for Phoenix
  • 31. 31 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage31 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix architecture revisited (thick client) RegionServer 2 T:foo, region:d Phoenix RPC endpoint px Application RegionServer 1 T:foo, region:d Phoenix RPC endpoint px RegionServer 3 T:foo, region:d Phoenix RPC endpoint px HBase client Phoenix client / JDBC
  • 32. 32 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage32 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix Query Server Phoenix Query Server (thin client) RegionServer 2 T:foo, region:d Phoenix RPC endpoint px Application Phoenix thin client / JDBC RegionServer 1 T:foo, region:d Phoenix RPC endpoint px RegionServer 3 T:foo, region:d Phoenix RPC endpoint px Phoenix client / JDBC HBase client Phoenix Query Server Phoenix client / JDBC HBase client Phoenix Query Server Phoenix client / JDBC HBase client
  • 33. 33 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage33 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Other new features (4.8+) Shaded client by default. No more library dependency problems! Phoenix schema mapping to HBase namespace • Allows using isolation and security features of HBase namespaces • Standard SQL syntax: CREATE SCHEMA FOO; USE FOO; LIMIT / OFFSET • We already had LIMIT. Now we have OFFSET • Together with Row-Value-Constructs, covers most of cursor use cases
  • 34. 34 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage34 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Part III – The Future All the upcoming stuff !
  • 35. 35 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage35 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Local Index • Local Index re-implemented • Instead of a different table, now local index data is kept within the same data table • Local index data goes into a different column family • Index and data is committed together atomically without external transactions • Bunch of stability improvements with region splits and merges • Building local index is faster as it reads and write data to same region.
  • 36. 36 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage36 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Calcite Integration Calcite is a framework for: • Query parser • Compiler • Planner • Cost based optimizer SQL-92 compliant Based on relational algebra Cost based optimizer with default rules + pluggable rules per-backend Used by Hive / Drill / Kylin / Samza, etc.
  • 37. 37 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage37 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Calcite Integration
  • 38. 38 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage38 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Phoenix - Hive integration Hive is a very rich and generic execution engine Uses Tez + YARN to execute arbitrary DAG Hive integration enables big joins and other Hive features Phoenix DDL with HiveQL Data insert / update delete (DML) with HiveQL Predicate pushdown, salting, partitioning, partition pruning, etc Can use secondary indexes as well since it uses Phoenix compiler https://issues.apache.org/jira/browse/PHOENIX-2743
  • 39. 39 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage39 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Future<Phoenix> JSON support TPC-H / Microstrategy / Tableau queries Sqoop integration Support Omid based transactions Dogfooding within the Hadoop-ecosystem • Ambari Metrics Service (AMS) uses Phoenix • YARN will soon use HBase / Phoenix (ATS) STRUCT type Improvements to cost based optimization Security and other HBase features used from Phoenix See https://phoenix.apache.org/roadmap.html
  • 40. 40 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage40 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Further Reference Even more info on https://phoenix.apache.org  New Features: https://phoenix.apache.org/recent.html  Roadmap: https://phoenix.apache.org/roadmap.html Get involved in mailing lists  user@phoenix.apache.org  dev@phoenix.apache.org
  • 41. 41 © Hortonworks Inc. 2011 – 2016. All Rights ReservedPage41 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Thanks Q & A

Editor's Notes

  1. - What is hbase? - What is it good at? - How do you use it in my applications? Context, first principals
  2. Understand the world it lives in and it’s building blocks
  3. Understand the world it lives in and it’s building blocks
  4. Understand the world it lives in and it’s building blocks