Local Secondary Indexes in
Apache Phoenix
Rajeshbabu Chintaguntla
PhoenixCon 2017
© Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
Local Indexes Introduction
Local indexes design and data model
Local index writes and reads
Performance Results
Helpful Tips or recommendations
Secondary indexes in Phoenix
• Primary key columns in a Phoenix table form the HBase row key, which acts as the primary index, so filtering on primary key columns becomes a point lookup or a range scan on the table.
• Filtering on a non-primary-key column turns the query into a full table scan, which consumes a lot of time and resources.
• With secondary indexes, we can create alternative access paths that convert such queries into point lookups or range scans.
• Phoenix supports two kinds of indexes: GLOBAL and LOCAL.
• Phoenix supports functional indexes as well.
Local Secondary Indexes - Introduction
• A local secondary index is LOCAL in the sense that each REGION of a table is treated as a unit, and an index is created and maintained over that region's data.
• The local index data is stored and maintained in shadow column families in the same table.
• So the index always resides on the same server that serves the actual data.
• Faster index building.
• Syntax:

Local Secondary Index - Introduction
CREATE TABLE IF NOT EXISTS ORDERS(
ORDER_ID BIGINT NOT NULL PRIMARY KEY,
CUSTOMER_ID BIGINT NOT NULL,
ITEM_ID INTEGER NOT NULL,
DATE DATE NOT NULL);
CREATE LOCAL INDEX IDX ON ORDERS(DATE)

BASE TABLE DATA (ORDER ID IS PRIMARY KEY)

Region [100, 104)
Order ID   Customer ID   Item ID   Date
100        11            1111      06/10/2017
101        23            1231      06/01/2017
102        11            1332      05/31/2017
103        34            3221      06/01/2017

Region [104, 107)
Order ID   Customer ID   Item ID   Date
104        55            1343      05/28/2017
105        11            2312      06/01/2017
106        29            1234      05/15/2017

INDEX ROW KEY

Index of Region [100, 104)
Region start key   IDX ID   Date         Order ID
100                1        05/31/2017   102
100                1        06/01/2017   101
100                1        06/01/2017   103
100                1        06/10/2017   100

Index of Region [104, 107)
Region start key   IDX ID   Date         Order ID
104                1        05/15/2017   106
104                1        05/28/2017   104
104                1        06/01/2017   105
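The per-region layout in the ORDERS example can be reproduced with a small simulation. This is only a sketch in plain Python, not Phoenix code: the function name `build_local_index` and the tuple layout are made up for illustration, and real index row keys are byte arrays rather than tuples.

```python
from datetime import date

# Rows of the ORDERS table from the slide: (order_id, customer_id, item_id, date).
ORDERS = [
    (100, 11, 1111, date(2017, 6, 10)),
    (101, 23, 1231, date(2017, 6, 1)),
    (102, 11, 1332, date(2017, 5, 31)),
    (103, 34, 3221, date(2017, 6, 1)),
    (104, 55, 1343, date(2017, 5, 28)),
    (105, 11, 2312, date(2017, 6, 1)),
    (106, 29, 1234, date(2017, 5, 15)),
]

# Region boundaries [start, end) from the slide.
REGIONS = [(100, 104), (104, 107)]
INDEX_ID = 1


def build_local_index(rows, regions, index_id):
    """Group index entries by region and sort them within each region.

    Each entry mimics the index row key from the slide:
    (region start key, index id, indexed DATE value, ORDER_ID).
    """
    index = {start: [] for start, _end in regions}
    for order_id, _customer_id, _item_id, order_date in rows:
        for start, end in regions:
            if start <= order_id < end:
                index[start].append((start, index_id, order_date, order_id))
    for entries in index.values():
        entries.sort()  # same ordering a sorted row key would give
    return index


local_index = build_local_index(ORDERS, REGIONS, INDEX_ID)
for region_start, entries in local_index.items():
    print(region_start, [order_id for *_prefix, order_id in entries])
```

Because every entry starts with the region start key, the index rows for a region sort inside that region's own key range, which is what keeps them on the same server as the data.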
Data Model
CREATE TABLE IF NOT EXISTS WEB_STAT (
HOST CHAR(2) NOT NULL,
DOMAIN VARCHAR NOT NULL,
FEATURE VARCHAR NOT NULL,
DATE DATE NOT NULL,
STATS.ACTIVE_VISITOR INTEGER
CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN));
1) CREATE LOCAL INDEX IDX ON WEB_STAT(DATE)
2) CREATE LOCAL INDEX IDX2 ON WEB_STAT(STATS.ACTIVE_VISITOR) INCLUDE(DATE)
3) CREATE LOCAL INDEX IDX3 ON WEB_STAT(DATE) INCLUDE(STATS.ACTIVE_VISITOR)
[Diagram: each region of the table (Region1, Region2) holds the data column families 0 and STATS alongside the shadow column families L#0 and L#STATS.]
Shadow column families to store the index data
Data Model
Local index row key format:
REGION START KEY | SALT NUMBER (empty for non-salted table) | INDEX ID | TENANT_ID (empty for non-multi-tenant table) | INDEXED COLUMN VALUE[S] | PRIMARY KEY COLUMN VALUE[S]
• REGION START KEY: Start key of the data region. For the first region, it is an empty byte array with the length of the region end key. This helps to index the data region by region.
• SALT NUMBER: A byte value that represents the salt bucket number calculated for the index row key.
• INDEX ID: A short number that identifies the local index. This helps to store the data of each index together.
• TENANT_ID: Tenant column value of the row key. It is empty if the table is not multi-tenant.
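The field order of the row key can be illustrated by concatenating the parts in sequence. This is a rough sketch under stated assumptions, not Phoenix's actual encoding: the helper `local_index_row_key`, the zero-byte separator, and the big-endian two-byte index id are all chosen here only to make the example concrete.

```python
import struct

SEPARATOR = b"\x00"  # assumed field separator, for this sketch only


def local_index_row_key(region_start_key, index_id, indexed_values,
                        pk_values, salt=b"", tenant_id=b""):
    """Concatenate the row key parts in the order listed on the slide.

    salt and tenant_id stay empty for non-salted, non-multi-tenant tables.
    """
    # Fixed prefix: region start key, optional salt byte, 2-byte index id.
    prefix = region_start_key + salt + struct.pack(">h", index_id)
    # Variable part: optional tenant id, indexed values, primary key values.
    fields = ([tenant_id] if tenant_id else []) + indexed_values + pk_values
    return prefix + SEPARATOR.join(fields)


key_a = local_index_row_key(b"F", 1, [b"2017-06-01"], [b"order-101"])
key_b = local_index_row_key(b"F", 2, [b"2017-05-01"], [b"order-102"])
print(key_a)
print(key_a < key_b)
```

Since the index id comes right after the region prefix, all rows of index 1 sort before any row of index 2 within the same region, which matches the slide's remark that the index id keeps each index's data together.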
Write path
[Diagram: CLIENT, Region Server, Region with a Data cf and an Index cf (each with its own MemStore), and the WAL]
1. Write request
2. Batch call
3. Prepare index updates
4. Merge data and index updates
5. Write to MemStores
6. Write to WAL
100% ATOMIC and CONSISTENT local index updates with data updates
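The write path steps can be mimicked with a toy in-memory region. This is a hedged sketch, not HBase or Phoenix code: the `Region` class, its `write` method, and the dictionary-based MemStores are hypothetical, and the sketch ignores details such as removing stale index entries when a row is updated.

```python
class Region:
    """Toy region with one data and one index column family (illustrative names)."""

    def __init__(self):
        self.memstores = {"data": {}, "index": {}}
        self.wal = []

    def write(self, row_key, columns, indexed_column):
        # Prepare the index update from the incoming data update.
        index_key = (columns[indexed_column], row_key)
        # Merge data and index updates into one batch.
        batch = [("data", row_key, columns), ("index", index_key, {})]
        # Write the whole batch to the MemStores.
        for family, key, value in batch:
            self.memstores[family][key] = value
        # Record the batch as a single WAL entry.
        self.wal.append(batch)


region = Region()
region.write(101, {"DATE": "2017-06-01", "CUSTOMER_ID": 23}, indexed_column="DATE")
print(region.memstores["index"])
print(len(region.wal))
```

Because the data update and the index update travel in the same batch to the same region, there is no second server that could fail halfway, which is the intuition behind the atomicity claim on the slide.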

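Read Path
SELECT COUNT(*) FROM T WHERE INDEXED_COL='findme'
[Diagram: the client sends the query to every region of the table: Region ['', F), Region [F, L), Region [L, R) and Region [R, ''), spread across two region servers. Each region holds the column families 0 and L#0. The values 2, 1, 0 and 5 appear next to the regions, presumably the per-region results.]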
Read Path
SELECT INDEX_COL, NON_INDEX_COL FROM T WHERE INDEX_COL='findme'
Joining back missing columns from data table
[Diagram: CLIENT and a Region with an Index cf and a Data cf, each with its own MemStore]
1. SCAN, L#0, FILTER
2. Apply filter on index col
3. Get non index cols on matching rows
4. Merge with index cols
5. Return combined results to client
6. Results
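The join-back steps above can be sketched with two in-memory structures standing in for the index and data column families of one region. This is an illustration only: `query_region`, the set of `(indexed value, row key)` pairs, and the dictionary of data rows are hypothetical stand-ins, not Phoenix internals.

```python
def query_region(index_cf, data_cf, wanted_value, non_index_col):
    """Scan the index, then fetch the missing column from the data rows."""
    results = []
    # Scan the index column family and apply the filter on the indexed column.
    for indexed_value, row_key in sorted(index_cf):
        if indexed_value != wanted_value:
            continue
        # Get the non-indexed column from the matching data row.
        data_row = data_cf[row_key]
        # Merge it with the indexed column.
        results.append((indexed_value, data_row[non_index_col]))
    # Return the combined results to the caller.
    return results


data_cf = {
    1: {"NON_INDEX_COL": "a"},
    2: {"NON_INDEX_COL": "b"},
    3: {"NON_INDEX_COL": "c"},
}
index_cf = {("findme", 1), ("other", 2), ("findme", 3)}

rows = query_region(index_cf, data_cf, "findme", "NON_INDEX_COL")
print(rows)
```

Each matching index entry costs one extra lookup into the data rows, so the cost of the join-back grows with the number of matches.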
Region Splits and Merges
• Since the indexes are also stored in the same table, splits and merges are taken care of by HBase automatically.
• We have a special mechanism to separate an HFile into the child regions after a split: we scan through each key value, find the data row key from it, and write it to the corresponding child region.
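The routing of index entries to child regions can be sketched as follows. This is a simplified model, not the actual HFile handling: the function `split_index_entries` is hypothetical, entries are tuples of (region start key, index id, indexed value, data row key), and rewriting the region prefix for the upper child is an assumption made here so that the entries sort inside the child's key range.

```python
def split_index_entries(entries, parent_start, split_key):
    """Route a parent region's index entries to its two child regions.

    The data row key is the last component of each entry and decides
    which child the entry belongs to.
    """
    lower, upper = [], []
    for _region_start, index_id, indexed_value, data_row_key in entries:
        if data_row_key < split_key:
            lower.append((parent_start, index_id, indexed_value, data_row_key))
        else:
            # Assumed: the upper child uses the split key as its prefix.
            upper.append((split_key, index_id, indexed_value, data_row_key))
    return lower, upper


# Index of Region [100, 104) from the ORDERS example, with ISO dates.
parent_index = [
    (100, 1, "2017-05-31", 102),
    (100, 1, "2017-06-01", 101),
    (100, 1, "2017-06-01", 103),
    (100, 1, "2017-06-10", 100),
]

lower_child, upper_child = split_index_entries(parent_index, 100, 102)
print(lower_child)
print(upper_child)
```

Note that the index entries cannot be split by their own sort order, since they are ordered by the indexed value; the data row key has to be read out of every entry, as the slide describes.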
Performance Results
• 4-node cluster
• Tested with 5 local indexes on a base table of 25 columns with 10 regions.
• Ingested 50M rows.
• 3x faster upsert time compared to global indexes
• 5x less network RX/TX utilization during writes compared to global indexes
• Similar read performance compared to global indexes for queries such as aggregations, GROUP BY, LIMIT, etc.

Performance results
Write performance
Performance results
Network Tx/Rx during write
Helpful Tips
• Mutable vs. immutable rows table?
– Writes are much faster with local indexes on an immutable rows table than on a mutable one. So if a row is written once and never updated, it is better to create the table with the IMMUTABLE_ROWS property.
• Online vs. offline index population?
– When a table has pre-existing data, the index population time varies with the data size.
– Usually index population happens on the server by reading the data table and writing the index to the same table. This normally works very fast. But if the data size is too big, it is better to use ASYNC population with IndexTool.
• Covered index vs. non-covered index?
– When a query accesses non-indexed columns, Phoenix joins the columns missing from the index from the data table itself by using get calls. If the number of matching rows is high, it is better to create a covered index to avoid the get calls.
Thank You
Q & A?
rajeshbabu@apache.org
@rajeshhcu32

More Related Content

What's hot

Apache Spark 3 Dynamic Partition Pruning
Apache Spark 3 Dynamic Partition PruningApache Spark 3 Dynamic Partition Pruning
Apache Spark 3 Dynamic Partition Pruning
Aparup Chatterjee
 
Optimizing Hive Queries
Optimizing Hive QueriesOptimizing Hive Queries
Optimizing Hive Queries
DataWorks Summit
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
Biju Nair
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
Ryan Blue
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Cloudera, Inc.
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
University of California, Santa Cruz
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxData
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
DataWorks Summit/Hadoop Summit
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
Databricks
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
Flink Forward
 
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming DataDruid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
DataWorks Summit
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
Dremio Corporation
 
Parquet overview
Parquet overviewParquet overview
Parquet overview
Julien Le Dem
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxData
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
 
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
A Rusty introduction to Apache Arrow and how it applies to a  time series dat...A Rusty introduction to Apache Arrow and how it applies to a  time series dat...
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
Andrew Lamb
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in Impala
Cloudera, Inc.
 
Overview SQL Server 2019
Overview SQL Server 2019Overview SQL Server 2019
Overview SQL Server 2019
Juan Fabian
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
DataWorks Summit
 

What's hot (20)

Apache Spark 3 Dynamic Partition Pruning
Apache Spark 3 Dynamic Partition PruningApache Spark 3 Dynamic Partition Pruning
Apache Spark 3 Dynamic Partition Pruning
 
Optimizing Hive Queries
Optimizing Hive QueriesOptimizing Hive Queries
Optimizing Hive Queries
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming DataDruid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
 
Parquet overview
Parquet overviewParquet overview
Parquet overview
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
A Rusty introduction to Apache Arrow and how it applies to a  time series dat...A Rusty introduction to Apache Arrow and how it applies to a  time series dat...
A Rusty introduction to Apache Arrow and how it applies to a time series dat...
 
Query Compilation in Impala
Query Compilation in ImpalaQuery Compilation in Impala
Query Compilation in Impala
 
Overview SQL Server 2019
Overview SQL Server 2019Overview SQL Server 2019
Overview SQL Server 2019
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 

Similar to Local Secondary Indexes in Apache Phoenix

Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Josh Elser
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit
 
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, JapanApache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Ankit Singhal
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
enissoz
 
Interactive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using DruidInteractive Analytics at Scale in Apache Hive Using Druid
Interactive Analytics at Scale in Apache Hive Using Druid
DataWorks Summit
 
Lightweight ETL pipelines with mara (PyData Berlin September Meetup)
Lightweight ETL pipelines with mara (PyData Berlin September Meetup)Lightweight ETL pipelines with mara (PyData Berlin September Meetup)
Lightweight ETL pipelines with mara (PyData Berlin September Meetup)
Martin Loetzsch
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizon
Thejas Nair
 
Hive 3 a new horizon
Hive 3  a new horizonHive 3  a new horizon
Hive 3 a new horizon
Abdelkrim Hadjidj
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
DataWorks Summit
 
Hbase mhug 2015
Hbase mhug 2015Hbase mhug 2015
Hbase mhug 2015
Joseph Niemiec
 
Ijebea14 228
Ijebea14 228Ijebea14 228
Ijebea14 228
Iasir Journals
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
Michael Stack
 
HBase Read High Availabilty using Timeline Consistent Region Replicas
HBase Read High Availabilty using Timeline Consistent Region ReplicasHBase Read High Availabilty using Timeline Consistent Region Replicas
HBase Read High Availabilty using Timeline Consistent Region Replicas
DataWorks Summit
 
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
Dave Stokes
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
DataWorks Summit/Hadoop Summit
 
IRJET- Rest API for E-Commerce Site
IRJET- Rest API for E-Commerce SiteIRJET- Rest API for E-Commerce Site
IRJET- Rest API for E-Commerce Site
IRJET Journal
 
War of the Indices- SQL vs. Oracle
War of the Indices-  SQL vs. OracleWar of the Indices-  SQL vs. Oracle
War of the Indices- SQL vs. Oracle
Kellyn Pot'Vin-Gorman
 
Hive(ppt)
Hive(ppt)Hive(ppt)
Hive(ppt)
Abhinav Tyagi
 
Hive(ppt)
Hive(ppt)Hive(ppt)
Hive(ppt)
Abhinav Tyagi
 
Sql server lesson6
Sql server lesson6Sql server lesson6
Sql server lesson6
Ala Qunaibi
 

Similar to Local Secondary Indexes in Apache Phoenix (20)

Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, JapanApache Phoenix and HBase - Hadoop Summit Tokyo, Japan
Apache Phoenix and HBase - Hadoop Summit Tokyo, Japan
 
HBase Read High Availability Using Timeline Consistent Region Replicas
Local Secondary Indexes in Apache Phoenix

  • 1. Local Secondary Indexes in Apache Phoenix. Rajeshbabu Chintaguntla, PhoenixCon 2017.
  • 2. Agenda: local indexes introduction; local index design and data model; local index writes and reads; performance results; helpful tips and recommendations. (© Hortonworks Inc. 2011 – 2017. All Rights Reserved.)
  • 3. Secondary indexes in Phoenix. The primary key columns of a Phoenix table form the HBase row key, which acts as the primary index, so filtering on primary key columns becomes a point lookup or range scan on the table. Filtering on a non-primary-key column turns the query into a full table scan, which consumes a lot of time and resources. Secondary indexes create alternative access paths that turn such queries into point lookups or range scans. Phoenix supports two kinds of secondary index, GLOBAL and LOCAL, and it supports functional indexes as well.
  • 4. Local Secondary Indexes - Introduction. A local secondary index is LOCAL in the sense that each REGION of a table is treated as a unit, and an index of that region's data is created and maintained. The local index data is stored and maintained in shadow column family(ies) in the same table, so the index resides 100% on the same server that serves the actual data. Faster index building. Syntax:
  • 5. Local Secondary Index - Introduction. CREATE TABLE IF NOT EXISTS ORDERS( ORDER_ID LONG NOT NULL PRIMARY KEY, CUSTOMER_ID LONG NOT NULL, ITEM_ID INTEGER NOT NULL, DATE DATE NOT NULL); CREATE LOCAL INDEX IDX ON ORDERS(DATE)
    Base table data (ORDER ID is the primary key):
      Region      Order ID  Customer ID  Item ID  Date
      [100, 104)  100       11           1111     06/10/2017
      [100, 104)  101       23           1231     06/01/2017
      [100, 104)  102       11           1332     05/31/2017
      [100, 104)  103       34           3221     06/01/2017
      [104, 107)  104       55           1343     05/28/2017
      [104, 107)  105       11           2312     06/01/2017
      [104, 107)  106       29           1234     05/15/2017
    Index row keys (one index per region):
      Index of region  Region start key  IDX ID  Date        Order ID
      [100, 104)       100               1       05/31/2017  102
      [100, 104)       100               1       06/01/2017  101
      [100, 104)       100               1       06/01/2017  103
      [100, 104)       100               1       06/10/2017  100
      [104, 107)       104               1       05/15/2017  106
      [104, 107)       104               1       05/28/2017  104
      [104, 107)       104               1       06/01/2017  105
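The ORDERS example on this slide can be reproduced with a small toy model. The sketch below is not Phoenix code: it only assumes that an index row key can be modeled as the tuple (region start key, index id, indexed value, primary key) and that sorting those tuples mimics HBase's row key ordering. The names `ORDERS`, `REGIONS`, `region_start_for` and `build_local_index` are made up for illustration.

```python
from datetime import date

# Base table rows from the slide: (order_id, customer_id, item_id, order_date).
ORDERS = [
    (100, 11, 1111, date(2017, 6, 10)),
    (101, 23, 1231, date(2017, 6, 1)),
    (102, 11, 1332, date(2017, 5, 31)),
    (103, 34, 3221, date(2017, 6, 1)),
    (104, 55, 1343, date(2017, 5, 28)),
    (105, 11, 2312, date(2017, 6, 1)),
    (106, 29, 1234, date(2017, 5, 15)),
]

# Region boundaries from the slide, as half-open [start, end) ranges on ORDER_ID.
REGIONS = [(100, 104), (104, 107)]
INDEX_ID = 1  # the "IDX ID" column on the slide


def region_start_for(order_id):
    """Return the start key of the region that holds this ORDER_ID."""
    for start, end in REGIONS:
        if start <= order_id < end:
            return start
    raise ValueError(f"no region holds order id {order_id}")


def build_local_index(rows):
    """Model each index row key as a tuple and sort, as HBase sorts row keys."""
    keys = [
        (region_start_for(order_id), INDEX_ID, order_date, order_id)
        for order_id, _customer_id, _item_id, order_date in rows
    ]
    return sorted(keys)


for start, index_id, order_date, order_id in build_local_index(ORDERS):
    print(start, index_id, order_date.strftime("%m/%d/%Y"), order_id)
```

Because the region start key is the leading field, all index entries for a region sort together, and within a region they are ordered by DATE, which is what makes a date filter a range scan per region.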
  • 6. Data Model: shadow column families store the index data. Table: CREATE TABLE IF NOT EXISTS WEB_STAT ( HOST CHAR(2) NOT NULL, DOMAIN VARCHAR NOT NULL, FEATURE VARCHAR NOT NULL, DATE DATE NOT NULL, STATS.ACTIVE_VISITOR INTEGER CONSTRAINT PK PRIMARY KEY (HOST, DOMAIN)); Indexes: 1) CREATE LOCAL INDEX IDX ON WEB_STAT(DATE) 2) CREATE LOCAL INDEX IDX2 ON WEB_STAT(STATS.ACTIVE_VISITOR) INCLUDE(DATE) 3) CREATE LOCAL INDEX IDX3 ON WEB_STAT(DATE) INCLUDE(STATS.ACTIVE_VISITOR) [Diagram: the table's regions (Region1, Region2), each shown with the data column families 0 and STATS and the shadow column families L#0 and L#STATS.]
  • 7. Data Model: local index row key format. The row key consists of, in order: REGION START KEY, SALT NUMBER (empty for a non-salted table), INDEX ID, TENANT_ID (empty for a non-multi-tenant table), INDEXED COLUMN VALUE[S], PRIMARY KEY COLUMN VALUE[S]. REGION START KEY: the start key of the data region; for the first region it is an empty byte array of the region end key's length. This helps to index the data region by region. SALT NUMBER: a byte value representing the salt bucket number calculated for the index row key. INDEX ID: a short number identifying the local index, which keeps each index's data stored together. TENANT_ID: the tenant column value of the row key; it is empty if the table is not multi-tenant.
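The field order of the local index row key can be sketched as a byte concatenation. This is a simplified illustration only: the function name `local_index_row_key`, the 2-byte big-endian encoding of the index id, and the plain concatenation without separators are assumptions made for the example, and Phoenix's real byte encoding differs in detail.

```python
import struct


def local_index_row_key(region_start_key, region_end_key, index_id,
                        indexed_values, pk_values,
                        salt_byte=None, tenant_id=None):
    """Concatenate the row key fields in the order listed on the slide.

    All values are raw bytes. The encoding is hypothetical (no separators,
    no type-specific encoding); only the field order follows the slide.
    """
    # First region: the start key is empty, so use zero bytes of end-key length.
    prefix = region_start_key or bytes(len(region_end_key))
    parts = [prefix]
    if salt_byte is not None:                  # empty for a non-salted table
        parts.append(bytes([salt_byte]))
    parts.append(struct.pack(">h", index_id))  # "a short number"
    if tenant_id is not None:                  # empty for a non-multi-tenant table
        parts.append(tenant_id)
    parts.extend(indexed_values)
    parts.extend(pk_values)
    return b"".join(parts)


# First region of a non-salted, single-tenant table.
print(local_index_row_key(b"", b"F", 1, [b"findme"], [b"A1"]))
# A later region of a salted, multi-tenant table.
print(local_index_row_key(b"F", b"L", 2, [b"findme"], [b"G7"],
                          salt_byte=3, tenant_id=b"t1"))
```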
  • 8. Write path. [Diagram: a client, and a region server hosting a region with data and index column families, their MemStores and the WAL. Labels: 1. write request; prepare index updates; 2. batch call; index updates and data updates; 4. merge data and index updates; 5. write to MemStores; 6. write to WAL.] Local index updates are 100% ATOMIC and CONSISTENT with the data updates.
  • 9. Read Path. [Diagram: the client issues SELECT COUNT(*) FROM T WHERE INDEXED_COL='findme' against the region servers hosting Region ['', F), Region [F, L), Region [L, R) and Region [R, ''); each region holds data column family 0 and index column family L#0. The diagram also shows the values 2, 1, 0 and 5.]
  • 10. Read Path: joining back missing columns from the data table. Query: SELECT INDEX_COL, NON_INDEX_COL FROM T WHERE INDEX_COL='findme'. [Diagram: a client, and a region with index and data column families and their MemStores.] Steps: 1. SCAN, L#0, FILTER; 2. apply the filter on the index column; 3. get the non-index columns for the matching rows; 4. merge them with the index columns; 5. return the combined results to the client; 6. results.
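The join-back steps on this slide can be mimicked with two in-memory dictionaries standing in for the data and index column families of one region. This is a toy model under stated assumptions, not Phoenix's server-side code: `DATA_CF`, `INDEX_CF` and `query` are hypothetical names, and the rows are invented sample data.

```python
# Data column family of one region: primary key -> full row (sample data).
DATA_CF = {
    "r1": {"INDEX_COL": "findme", "NON_INDEX_COL": "a"},
    "r2": {"INDEX_COL": "other", "NON_INDEX_COL": "b"},
    "r3": {"INDEX_COL": "findme", "NON_INDEX_COL": "c"},
}

# Index column family of the same region: sorted (indexed value, primary key).
INDEX_CF = sorted((row["INDEX_COL"], pk) for pk, row in DATA_CF.items())


def query(value, wanted_cols):
    """Answer SELECT wanted_cols ... WHERE INDEX_COL = value via the index."""
    results = []
    for indexed_value, pk in INDEX_CF:        # 1. scan the index column family
        if indexed_value != value:            # 2. apply the filter on the index column
            continue
        row = {"INDEX_COL": indexed_value}
        missing = [c for c in wanted_cols if c not in row]
        data_row = DATA_CF[pk]                # 3. get non-index columns (same region)
        row.update({c: data_row[c] for c in missing})  # 4. merge with index columns
        results.append(row)
    return results                            # 5. return combined results


print(query("findme", ["INDEX_COL", "NON_INDEX_COL"]))
```

Step 3 is one lookup per matching row, which is why the number of matching rows drives the cost of a non-covered index.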
  • 11. Region Splits and Merges. Because the indexes are stored in the same table, HBase takes care of splits and merges automatically. A special mechanism separates each HFile between the child regions after a split: it scans through every key value, finds the data row key in it, and writes the key value to the corresponding child region.
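The split mechanism can be sketched with the same tuple model of an index entry, (region start key, index id, indexed value, data row key). The function `split_index_entries` is hypothetical, and re-prefixing the upper child's entries with the child's start key is an assumption suggested by the row key format rather than something the slide states.

```python
def split_index_entries(index_entries, split_key):
    """Route each index entry to a child region by its embedded data row key.

    index_entries: tuples of (region_start, index_id, indexed_value, data_row_key)
    in scan order. Returns (lower_child_entries, upper_child_entries).
    """
    lower, upper = [], []
    for region_start, index_id, value, data_row_key in index_entries:
        if data_row_key < split_key:
            # The lower child keeps the parent's start key.
            lower.append((region_start, index_id, value, data_row_key))
        else:
            # Assumption: the upper child's entries take the split key as prefix.
            upper.append((split_key, index_id, value, data_row_key))
    return lower, upper


# Index of Region [100, 104) from the ORDERS example, split at ORDER_ID 102.
parent = [
    (100, 1, "2017-05-31", 102),
    (100, 1, "2017-06-01", 101),
    (100, 1, "2017-06-01", 103),
    (100, 1, "2017-06-10", 100),
]
lower, upper = split_index_entries(parent, 102)
print(lower)
print(upper)
```

The index entries cannot be divided by their own row key, since they all share the parent's start key prefix; the data row key at the end of each entry is what decides the child.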
  • 12. Performance Results. 4-node cluster. Tested with 5 local indexes on a base table of 25 columns with 10 regions. Ingested 50M rows. Upsert time was 3x faster than with global indexes. Network RX/TX utilization during writes was 5x lower than with global indexes. Read performance was similar to global indexes for queries such as aggregations, GROUP BY and LIMIT.
  • 13. Performance results: write performance.
  • 14. Performance results: network Tx/Rx during write.
  • 15. Performance results: network Tx/Rx during write.
  • 16. Performance results: network Tx/Rx during write.
  • 17. Helpful Tips. Mutable vs. immutable rows table? Writes with local indexes are much faster on an immutable-rows table than on a mutable one, so if rows are written once and never updated, it is better to create the table with the IMMUTABLE_ROWS property. Online vs. offline index population? When a table already holds data, index population time varies with the data size. Index population usually happens on the server, by reading the data table and writing the index to the same table, and it normally works very fast; if the data size is very large, it is better to use ASYNC population with IndexTool. Covered vs. non-covered index? When a query accesses non-indexed columns, Phoenix joins the columns missing from the index back from the data table itself using get calls. If the number of matching rows is high, it is better to create a covered index to avoid the get calls.
  • 18. Thank You. Q & A? rajeshbabu@apache.org, @rajeshhcu32