SlideShare a Scribd company logo
(Big Data)2
How YARN Timeline Service v.2 Unlocks 360-Degree
Platform Insights at Scale
Sangjin Lee @sjlee (Twitter)
Joep Rottinghuis @joep (Twitter)
Outline
• Why v.2?
• Highlights
• Developing for Timeline Service v.2
• Setting up Timeline Service v.2
• Milestones
• Demo
Why v.2?
• YARN Timeline Service v 1.x
• Gained good adoption: Tez, HIVE, Pig, etc.
• Keeps improving with v 1.5 APIs and storage implementation
• Still facing some fundamental challenges...
Why v.2?
• Scalability and reliability challenges
• Single instance of Timeline Server
• Storage (single local LevelDB instance)
• Usability
• Flow
• Metrics and configuration as first-class citizens
• Metrics aggregation up the entity hierarchy

Recommended for you

Amazon API Gateway
Amazon API GatewayAmazon API Gateway
Amazon API Gateway

API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. It allows hosting multiple API versions and stages, generating SDKs, adding authentication, throttling requests, and caching responses to improve performance and reduce latency. API Gateway supports building and deploying REST and WebSocket APIs. Pricing is based on the number of API calls and amount of data transferred out. Optional dedicated caching tiers are also available.

restapisapi gateway
Spring Boot+Kafka: the New Enterprise Platform
Spring Boot+Kafka: the New Enterprise PlatformSpring Boot+Kafka: the New Enterprise Platform
Spring Boot+Kafka: the New Enterprise Platform

This document discusses how Spring Boot and Kafka can form the basis of a new enterprise application platform focused on continuous delivery, event-driven architectures, and streaming data. It provides examples of companies that have successfully adopted this approach, such as Netflix transitioning to Spring Boot and a banking brand building a new core banking system using Spring Streams and Kafka. The document advocates an "event-first" and microservices-oriented mindset enabled by a streaming data platform and suggests that Spring Boot, Kafka, and related technologies provide a turnkey solution for implementing this new application development approach at large enterprises.

kafkaspring bootspring
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud

This document provides an overview and summary of Amazon S3 best practices and tuning for Hadoop/Spark in the cloud. It discusses the relationship between Hadoop/Spark and S3, the differences between HDFS and S3 and their use cases, details on how S3 behaves from the perspective of Hadoop/Spark, well-known pitfalls and tunings related to S3 consistency and multipart uploads, and recent community activities related to S3. The presentation aims to help users optimize their use of S3 storage with Hadoop/Spark frameworks.

awsemrhadoop
Highlights
v.1 v.2
Single writer/reader Timeline Server Distributed writer/collector architecture
Single local LevelDB storage* Scalable storage (HBase)
v.1 entity model New v.2 entity model
No aggregation Metrics aggregation
REST API Richer query REST API
Architecture
• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage
Distributed collectors & readers
What is a flow?
• A flow is a group of YARN
applications that are launched as
parts of a logical app
• Oozie, Scalding, Pig, etc.
• name:
“frequent_visitor_stat”
• run id: 1466097809000
• version: “b9b9068”

Recommended for you

A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi

Recently, a set of modern table formats such as Delta Lake, Hudi, Iceberg spring out. Along with Hive Metastore these table formats are trying to solve problems that stand in traditional data lake for a long time with their declared features like ACID, schema evolution, upsert, time travel, incremental consumption etc.

spark + ai summit

 *
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers

Data Lakes have been built with a desire to democratize data - to allow more and more people, tools, and applications to make use of data. A key capability needed to achieve it is hiding the complexity of underlying data structures and physical data storage from users. The de-facto standard has been the Hive table format addresses some of these problems but falls short at data, user, and application scale. So what is the answer? Apache Iceberg. Apache Iceberg table format is now in use and contributed to by many leading tech companies like Netflix, Apple, Airbnb, LinkedIn, Dremio, Expedia, and AWS. Watch Alex Merced, Developer Advocate at Dremio, as he describes the open architecture and performance-oriented capabilities of Apache Iceberg. You will learn: • The issues that arise when using the Hive table format at scale, and why we need a new table format • How a straightforward, elegant change in table format structure has enormous positive effects • The underlying architecture of an Apache Iceberg table, how a query against an Iceberg table works, and how the table’s underlying structure changes as CRUD operations are done on it • The resulting benefits of this architectural design

high throughput and low latencyp99p99 conf
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers

The presentation at CloudNativeCon Europe 2017, about how to design logging platform in container and microservices based computing platforms.

cncfcloudnativeconfluentd
Configuration and metrics
• Now explicit top-level attributes of
entities
• Fine-grained updates and queries
made possible
• “update metric A to value x”
• “query entities where config A = B”
Configuration and metrics
• Now explicit top-level attributes of
entities
• Fine-grained updates and queries
made possible
• “update metric A to value x”
• “query entities where config A = B”
HBase Storage
• Scalable backend
• Row Key structure
• efficient range scans
• KeyPrefixRegionSplitPolicy
• Filter pushdown
• Coprocessors for flow aggregation (“readless” aggregation)
• Cell tags for metadata (application id, aggregation operation)
• Cell timestamps generated during put
• left shifted with app id added to avoid overwrites
Tables in HBase
• flow run
• application
• entity
• flow activity
• app to flow

Recommended for you

Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?

Kafka Streams is a client library for building distributed applications that process streaming data stored in Apache Kafka. It provides a high-level streams DSL that allows developers to express streaming applications as set of processing steps. Alternatively, developers can use the lower-level processor API to implement custom business logic. Kafka Streams handles tasks like fault-tolerance, scalability and state management. It represents data as streams for unbounded data or tables for bounded state. Common operations include transformations, aggregations, joins and table operations.

apache kafkaevent streamingkafka streams
Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)

Memory leaks are not always simple or easy to find. Heap dumps from production systems are often gigantic (4+ gigs) with millions of objects in memory. Simple spot checking with traditional tools is woefully inadequate in these situations, especially with real data. Leaks can be entire object graphs with enormous amounts of noise. This session will show you how to build custom tools using the Apache NetBeans Profiler/Heapwalker APIs. Using these APIs, you can read and analyze Java heaps programmatically to ask really hard questions. This gives you the power to analyze complex object graphs with tens of thousands of objects in seconds.

javaperformancenetbeans
Hive tuning
Hive tuningHive tuning
Hive tuning

This document provides an overview of Hive and its performance capabilities. It discusses Hive's SQL interface for querying large datasets stored in Hadoop, its architecture which compiles SQL queries into MapReduce jobs, and its support for SQL semantics and datatypes. The document also covers techniques for optimizing Hive performance, including data abstractions like partitions, buckets and skews. It describes different join strategies in Hive like shuffle joins, broadcast joins and sort-merge bucket joins and how they are implemented in MapReduce. The overall presentation aims to explain how Hive provides scalable SQL processing for big data.

table: flow run
Row key:
clusterId!userName!flo
wName!inverted(flowRun
Id)
• most recent flow run stored first
• coprocessor enabled
table: application
Row key:
clusterId!userName!flowN
ame!inverted(flowRunId)!
AppId
• applications within a flow run stored
together
• most recent flow run stored first
table: entity
Row key:
userName!clusterId!flowName!inverted(flo
wRunId)!AppId!entityType!entityId
• entities within an application within a flow run stored together per
type
• for example, all containers within a yarn application will be stored
together
• pre-split table
• stores information per entity run like info, relatesTo, relatedTo,
events, metrics, config
table: flow activity
Row key:
clusterId!inverted(TopOfTh
eDay)!userName!flowName
• shows the flows that ran on that day
• stores information per flow like number of
runs, the run ids, versions

Recommended for you

Terraform introduction
Terraform introductionTerraform introduction
Terraform introduction

This document introduces infrastructure as code (IaC) using Terraform and provides examples of deploying infrastructure on AWS including: - A single EC2 instance - A single web server - A cluster of web servers using an Auto Scaling Group - Adding a load balancer using an Elastic Load Balancer It also discusses Terraform concepts and syntax like variables, resources, outputs, and interpolation. The target audience is people who deploy infrastructure on AWS or other clouds.

Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ Uber

This document discusses Pinot, Uber's real-time analytics platform. It provides an overview of Pinot's architecture and data ingestion process, describes a case study on modeling trip data in Pinot, and benchmarks Pinot's performance on ingesting large volumes of data and answering queries in real-time.

Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...

Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Current 2022 Can you answer how a given event came to be? Is it an aggregation, a combination of multiple events with different sources? What are its origins? Given the growing complexity of event streaming architectures - stateful processing, joins, fan-outs, multi-cluster flows - it is increasingly important to be able to accurately answer those questions, understand data flows and capture data provenance. This talk will walk through how to use and extend OpenTelemetry Java agent auto instrumentation to achieve full end-to-end traceability in Kafka event streaming architectures involving multi-cluster deployments, the Connect platform, stateful KStream applications and ksqlDB workloads. We will cover: - Distributed Tracing concepts - context propagation and the OpenTelemetry implementation stack; - Java agent auto instrumentation, problems faced when instrumenting service platforms (Connect and ksqlDB), stateful applications (KStreams and ksqlDB) and how auto instrumentation can be extended using loadable extensions to solve those problems; - Demo of an end-to-end tracing implementation and a highlight of the interesting use cases it enables.

event streaming architecturesend-to-end traceabilitykafka events
table: appToFlow
Row key:
clusterId!appId
- stores mapping of appId to
flowName and flowRunId
Metrics aggregation
• Application level
• Rolls up sub-application metrics
• Performed in real time in the collectors in memory
• Flow run level
• Rolls up app level metrics
• Performed in HBase region servers via coprocessors
• Offline aggregation (TBD)
• Rolls up on user, queue, and flow offline periodically
• Phoenix tables
FlowRun
Aggregation
via the HBase
Coprocessor
App
Metrics
Cells
in
HBase
FlowRun
Metric
Sum
App
Metrics
Cells
in
HBase
FlowRun
Metric
Sum
FlowRun
Aggregation
via the HBase
Coprocessor

Recommended for you

Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep dive

This document provides a summary of improvements made to Hive's performance through the use of Apache Tez and other optimizations. Some key points include: - Hive was improved to use Apache Tez as its execution engine instead of MapReduce, reducing latency for interactive queries and improving throughput for batch queries. - Statistics collection was optimized to gather column-level statistics from ORC file footers, speeding up statistics gathering. - The cost-based optimizer Optiq was added to Hive, allowing it to choose better execution plans. - Vectorized query processing, broadcast joins, dynamic partitioning, and other optimizations improved individual query performance by over 100x in some cases.

hivetezapache
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar

What is Apache Pulsar? How does it differ from other event streaming technologies available? StreamNative Developer Advocate Tim Spann will walk you through the features and architecture of this increasingly popular event streaming system, along with best practices for streaming and storing your data.

ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data

ORC files were originally introduced in Hive, but have now migrated to an independent Apache project. This has sped up the development of ORC and simplified integrating ORC into other projects, such as Hadoop, Spark, Presto, and Nifi. There are also many new tools that are built on top of ORC, such as Hive’s ACID transactions and LLAP, which provides incredibly fast reads for your hot data. LLAP also provides strong security guarantees that allow each user to only see the rows and columns that they have permission for. This talk will discuss the details of the ORC and Parquet formats and what the relevant tradeoffs are. In particular, it will discuss how to format your data and the options to use to maximize your read performance. In particular, we’ll discuss when and how to use ORC’s schema evolution, bloom filters, and predicate push down. It will also show you how to use the tools to translate ORC files into human-readable formats, such as JSON, and display the rich metadata from the file including the type in the file and min, max, and count for each column.

dataworks summitdataworks summit 2017hadoop summit
Reader REST API: paths
• URLs under /ws/v2/timeline
• Canonical REST style URLs:
/ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_n
ame/runs/run_id
• Path elements may be omitted if they can be inferred
• flow context can be inferred by app id
• default cluster is assumed if cluster is omitted
Setting up Timeline Service v.2
• Set up the HBase cluster (1.1.x)
• Add the timeline service jar to HBase
• Install the flow run coprocessor
• Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
• Enable Timeline Service v.2
• Add hbase-site.xml for the timeline collector and readers
• Start the timeline reader daemon
Milestone 1 ("Alpha 1")
• Merge discussion (YARN-2928) in progress as we speak!
✓ Complete end-to-end read/write flow
✓ Real time application and flow
aggregation
✓ New entity model
✓ HBase Storage
✓ Rich REST API
✓ Integration with Distributed Shell
and MapReduce
✓ YARN generic events and system
metrics
Milestones - Future
• Milestone 2 (“Alpha 2”)
• Integration with new YARN
UI
• Integration with more
frameworks
• Beta
• Freeze API and storage schema
• Security
• Collectors as containers
• Storage fault tolerance
• Production-ready
• Migration-ready

Recommended for you

Kafka presentation
Kafka presentationKafka presentation
Kafka presentation

Kafka is an open source messaging system that can handle massive streams of data in real-time. It is fast, scalable, durable, and fault-tolerant. Kafka is commonly used for stream processing, website activity tracking, metrics collection, and log aggregation. It supports high throughput, reliable delivery, and horizontal scalability. Some examples of real-time use cases for Kafka include website monitoring, network monitoring, fraud detection, and IoT applications.

Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications

In this talk by Sean Glover, Principal Engineer at Lightbend, we will review how the Strimzi Kafka Operator, a supported technology in Lightbend Platform, makes many operational tasks in Kafka easy, such as the initial deployment and updates of a Kafka and ZooKeeper cluster. See the blog post containing the YouTube video here: https://www.lightbend.com/blog/running-kafka-on-kubernetes-with-strimzi-for-real-time-streaming-applications

kafkakubernetesk8s
Timeline Service v.2 (Hadoop Summit 2016)
Timeline Service v.2 (Hadoop Summit 2016)Timeline Service v.2 (Hadoop Summit 2016)
Timeline Service v.2 (Hadoop Summit 2016)

This document summarizes the new YARN Timeline Service version 2, which was developed to address scalability, reliability, and usability challenges in version 1. Key highlights of version 2 include a distributed collector architecture for scalable and fault-tolerant writing of timeline data, an entity data model with first-class configuration and metrics support, and metrics aggregation capabilities. It stores data in HBase for scalability and provides a richer REST API for querying. Milestone goals include integration with more frameworks and production readiness.

hadoopyarn
Contributors
• Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks)
• Varun Saxena, Naganarasimha G. R. (Huawei)
• Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter)
• Zhijie Shen (now at Facebook)
• The HBase and Phoenix community!
Thank you!

More Related Content

What's hot

File Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & ParquetFile Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & Parquet
DataWorks Summit/Hadoop Summit
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
Rico Chen
 
gRPC
gRPCgRPC
Amazon API Gateway
Amazon API GatewayAmazon API Gateway
Amazon API Gateway
Mark Bate
 
Spring Boot+Kafka: the New Enterprise Platform
Spring Boot+Kafka: the New Enterprise PlatformSpring Boot+Kafka: the New Enterprise Platform
Spring Boot+Kafka: the New Enterprise Platform
VMware Tanzu
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Databricks
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
ScyllaDB
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
SATOSHI TAGOMORI
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
confluent
 
Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)
Ryan Cuprak
 
Hive tuning
Hive tuningHive tuning
Hive tuning
Michael Zhang
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introduction
Jason Vance
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ Uber
Xiang Fu
 
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
HostedbyConfluent
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
t3rmin4t0r
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
ScyllaDB
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
DataWorks Summit
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
Mohammed Fazuluddin
 
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Lightbend
 

What's hot (20)

File Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & ParquetFile Format Benchmark - Avro, JSON, ORC & Parquet
File Format Benchmark - Avro, JSON, ORC & Parquet
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
gRPC
gRPCgRPC
gRPC
 
Amazon API Gateway
Amazon API GatewayAmazon API Gateway
Amazon API Gateway
 
Spring Boot+Kafka: the New Enterprise Platform
Spring Boot+Kafka: the New Enterprise PlatformSpring Boot+Kafka: the New Enterprise Platform
Spring Boot+Kafka: the New Enterprise Platform
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?Kafka Streams: What it is, and how to use it?
Kafka Streams: What it is, and how to use it?
 
Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)
 
Hive tuning
Hive tuningHive tuning
Hive tuning
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introduction
 
Pinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ UberPinot: Near Realtime Analytics @ Uber
Pinot: Near Realtime Analytics @ Uber
 
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
Implementing End-To-End Tracing With Roman Kolesnev and Antony Stubbs | Curre...
 
Hive+Tez: A performance deep dive
Hive+Tez: A performance deep diveHive+Tez: A performance deep dive
Hive+Tez: A performance deep dive
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming ApplicationsRunning Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
Running Kafka On Kubernetes With Strimzi For Real-Time Streaming Applications
 

Similar to HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

Timeline Service v.2 (Hadoop Summit 2016)
Timeline Service v.2 (Hadoop Summit 2016)Timeline Service v.2 (Hadoop Summit 2016)
Timeline Service v.2 (Hadoop Summit 2016)
Sangjin Lee
 
Timeline service V2 at the Hadoop Summit SJ 2016
Timeline service V2 at the Hadoop Summit SJ 2016Timeline service V2 at the Hadoop Summit SJ 2016
Timeline service V2 at the Hadoop Summit SJ 2016
Vrushali Channapattan
 
Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...
Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...
Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...
Zhijie Shen
 
The Evolution of a Relational Database Layer over HBase
The Evolution of a Relational Database Layer over HBaseThe Evolution of a Relational Database Layer over HBase
The Evolution of a Relational Database Layer over HBase
DataWorks Summit
 
WSO2 Quarterly Technical Update
WSO2 Quarterly Technical UpdateWSO2 Quarterly Technical Update
WSO2 Quarterly Technical Update
WSO2
 
Andrei shakirin rest_cxf
Andrei shakirin rest_cxfAndrei shakirin rest_cxf
Andrei shakirin rest_cxf
Aravindharamanan S
 
Apache Hadoop YARN State of the Union
Apache Hadoop YARN State of the UnionApache Hadoop YARN State of the Union
Apache Hadoop YARN State of the Union
Weiwei Yang
 
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
Amazon Web Services
 
Rails Request & Middlewares
Rails Request & MiddlewaresRails Request & Middlewares
Rails Request & Middlewares
Santosh Wadghule
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
Kai Sasaki
 
What's New in .Net 4.5
What's New in .Net 4.5What's New in .Net 4.5
What's New in .Net 4.5
Malam Team
 
Building Distributed Systems With Riak and Riak Core
Building Distributed Systems With Riak and Riak CoreBuilding Distributed Systems With Riak and Riak Core
Building Distributed Systems With Riak and Riak Core
Andy Gross
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
Evan Chan
 
Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and Future
VARUN SAXENA
 
Introducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load BalancerIntroducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load Balancer
WSO2
 
Cloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for HadoopCloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for Hadoop
Cloudera, Inc.
 
Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1
Hao H. Zhang
 
What's New in IBM Streams V4.1
What's New in IBM Streams V4.1What's New in IBM Streams V4.1
What's New in IBM Streams V4.1
lisanl
 
Impala presentation
Impala presentationImpala presentation
Impala presentation
trihug
 
What will be new in Apache NiFi 1.2.0
What will be new in Apache NiFi 1.2.0What will be new in Apache NiFi 1.2.0
What will be new in Apache NiFi 1.2.0
Koji Kawamura
 

Similar to HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale (20)

Timeline Service v.2 (Hadoop Summit 2016)
Timeline Service v.2 (Hadoop Summit 2016)Timeline Service v.2 (Hadoop Summit 2016)
Timeline Service v.2 (Hadoop Summit 2016)
 
Timeline service V2 at the Hadoop Summit SJ 2016
Timeline service V2 at the Hadoop Summit SJ 2016Timeline service V2 at the Hadoop Summit SJ 2016
Timeline service V2 at the Hadoop Summit SJ 2016
 
Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...
Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...
Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Ha...
 
The Evolution of a Relational Database Layer over HBase
The Evolution of a Relational Database Layer over HBaseThe Evolution of a Relational Database Layer over HBase
The Evolution of a Relational Database Layer over HBase
 
WSO2 Quarterly Technical Update
WSO2 Quarterly Technical UpdateWSO2 Quarterly Technical Update
WSO2 Quarterly Technical Update
 
Andrei shakirin rest_cxf
Andrei shakirin rest_cxfAndrei shakirin rest_cxf
Andrei shakirin rest_cxf
 
Apache Hadoop YARN State of the Union
Apache Hadoop YARN State of the UnionApache Hadoop YARN State of the Union
Apache Hadoop YARN State of the Union
 
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
AWS re:Invent 2016: Streaming ETL for RDS and DynamoDB (DAT315)
 
Rails Request & Middlewares
Rails Request & MiddlewaresRails Request & Middlewares
Rails Request & Middlewares
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
 
What's New in .Net 4.5
What's New in .Net 4.5What's New in .Net 4.5
What's New in .Net 4.5
 
Building Distributed Systems With Riak and Riak Core
Building Distributed Systems With Riak and Riak CoreBuilding Distributed Systems With Riak and Riak Core
Building Distributed Systems With Riak and Riak Core
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
 
Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and Future
 
Introducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load BalancerIntroducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load Balancer
 
Cloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for HadoopCloudera Impala: A Modern SQL Engine for Hadoop
Cloudera Impala: A Modern SQL Engine for Hadoop
 
Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1Kubernetes Architecture - beyond a black box - Part 1
Kubernetes Architecture - beyond a black box - Part 1
 
What's New in IBM Streams V4.1
What's New in IBM Streams V4.1What's New in IBM Streams V4.1
What's New in IBM Streams V4.1
 
Impala presentation
Impala presentationImpala presentation
Impala presentation
 
What will be new in Apache NiFi 1.2.0
What will be new in Apache NiFi 1.2.0What will be new in Apache NiFi 1.2.0
What will be new in Apache NiFi 1.2.0
 

More from Michael Stack

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloudhbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
Michael Stack
 
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinteresthbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinterest
Michael Stack
 
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltdhbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
Michael Stack
 
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didihbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
Michael Stack
 
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
Michael Stack
 
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencenthbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencent
Michael Stack
 
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
Michael Stack
 
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
Michael Stack
 
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Componenthbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
Michael Stack
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
Michael Stack
 
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomihbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
Michael Stack
 
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Sparkhbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
Michael Stack
 
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBasehbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
Michael Stack
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
Michael Stack
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
Michael Stack
 
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLhbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
Michael Stack
 
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBasehbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
Michael Stack
 
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
Michael Stack
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
Michael Stack
 
HBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteHBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
Michael Stack
 

More from Michael Stack (20)

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloudhbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
 
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinteresthbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinterest
 
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltdhbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
 
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didihbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
 
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
 
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencenthbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencent
 
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
 
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
 
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Componenthbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
 
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomihbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
 
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Sparkhbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
 
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBasehbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
 
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLhbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
 
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBasehbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
 
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
 
HBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteHBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
 

Recently uploaded

Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model SafeBangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
bookhotbebes1
 
Understanding Cybersecurity Breaches: Causes, Consequences, and Prevention
Understanding Cybersecurity Breaches: Causes, Consequences, and PreventionUnderstanding Cybersecurity Breaches: Causes, Consequences, and Prevention
Understanding Cybersecurity Breaches: Causes, Consequences, and Prevention
Bert Blevins
 
system structure in operating systems.pdf
system structure in operating systems.pdfsystem structure in operating systems.pdf
system structure in operating systems.pdf
zyroxsunny
 
Application Infrastructure and cloud computing.pdf
Application Infrastructure and cloud computing.pdfApplication Infrastructure and cloud computing.pdf
Application Infrastructure and cloud computing.pdf
Mithun Chakroborty
 
@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time
@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time
@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time
Escorts service
 
IWISS Catalog 2024
IWISS Catalog 2024IWISS Catalog 2024
IWISS Catalog 2024
Iwiss Tools Co.,Ltd
 
Response & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITHResponse & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITH
IIIT Hyderabad
 
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
VICTOR MAESTRE RAMIREZ
 
Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.
Tool and Die Tech
 
Net Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK EmpireNet Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK Empire
Global Network for Zero
 
Lecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............pptLecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............ppt
RujanTimsina1
 
Lecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdfLecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdf
peacekipu
 
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model SafeRohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
binna singh$A17
 
Introduction to neural network (Module 1).pptx
Introduction to neural network (Module 1).pptxIntroduction to neural network (Module 1).pptx
Introduction to neural network (Module 1).pptx
archanac21
 
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdfOCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
Muanisa Waras
 
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdfGUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
ProexportColombia1
 
Literature Reivew of Student Center Design
Literature Reivew of Student Center DesignLiterature Reivew of Student Center Design
Literature Reivew of Student Center Design
PriyankaKarn3
 
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
kinni singh$A17
 
South Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
South Mumbai @Call @Girls Whatsapp 9930687706 With High Profile ServiceSouth Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
South Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
kolkata dolls
 
UNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-ID
UNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-IDUNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-ID
UNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-ID
GOWSIKRAJA PALANISAMY
 

Recently uploaded (20)

Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model SafeBangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Bangalore @ℂall @Girls ꧁❤ 0000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
 
Understanding Cybersecurity Breaches: Causes, Consequences, and Prevention
Understanding Cybersecurity Breaches: Causes, Consequences, and PreventionUnderstanding Cybersecurity Breaches: Causes, Consequences, and Prevention
Understanding Cybersecurity Breaches: Causes, Consequences, and Prevention
 
system structure in operating systems.pdf
system structure in operating systems.pdfsystem structure in operating systems.pdf
system structure in operating systems.pdf
 
Application Infrastructure and cloud computing.pdf
Application Infrastructure and cloud computing.pdfApplication Infrastructure and cloud computing.pdf
Application Infrastructure and cloud computing.pdf
 
@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time
@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time
@Call @Girls Kochi 🚒 XXXXXXXXXX 🚒 Priya Sharma Beautiful And Cute Girl any Time
 
IWISS Catalog 2024
IWISS Catalog 2024IWISS Catalog 2024
IWISS Catalog 2024
 
Response & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITHResponse & Safe AI at Summer School of AI at IIITH
Response & Safe AI at Summer School of AI at IIITH
 
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
Advances in Detect and Avoid for Unmanned Aircraft Systems and Advanced Air M...
 
Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.Trends in Computer Aided Design and MFG.
Trends in Computer Aided Design and MFG.
 
Net Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK EmpireNet Zero Case Study: SRK House and SRK Empire
Net Zero Case Study: SRK House and SRK Empire
 
Lecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............pptLecture 3 Biomass energy...............ppt
Lecture 3 Biomass energy...............ppt
 
Lecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdfLecture 6 - The effect of Corona effect in Power systems.pdf
Lecture 6 - The effect of Corona effect in Power systems.pdf
 
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model SafeRohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
Rohini @ℂall @Girls ꧁❤ 9873777170 ❤꧂VIP Yogita Mehra Top Model Safe
 
Introduction to neural network (Module 1).pptx
Introduction to neural network (Module 1).pptxIntroduction to neural network (Module 1).pptx
Introduction to neural network (Module 1).pptx
 
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdfOCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
OCS Training - Rig Equipment Inspection - Advanced 5 Days_IADC.pdf
 
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdfGUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
GUIA_LEGAL_CHAPTER-9_COLOMBIAN ELECTRICITY (1).pdf
 
Literature Reivew of Student Center Design
Literature Reivew of Student Center DesignLiterature Reivew of Student Center Design
Literature Reivew of Student Center Design
 
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
( Call  ) Girls Vasant Kunj Just 9873940964 High Class Model Shneha Patil
 
South Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
South Mumbai @Call @Girls Whatsapp 9930687706 With High Profile ServiceSouth Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
South Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
 
UNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-ID
UNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-IDUNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-ID
UNIT I INCEPTION OF INFORMATION DESIGN 20CDE09-ID
 

HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform insights at scale

  • 1. (Big Data)2 How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale Sangjin Lee @sjlee (Twitter) Joep Rottinghuis @joep (Twitter)
  • 2. Outline • Why v.2? • Highlights • Developing for Timeline Service v.2 • Setting up Timeline Service v.2 • Milestones • Demo
  • 3. Why v.2? • YARN Timeline Service v 1.x • Gained good adoption: Tez, HIVE, Pig, etc. • Keeps improving with v 1.5 APIs and storage implementation • Still facing some fundamental challenges...
  • 4. Why v.2? • Scalability and reliability challenges • Single instance of Timeline Server • Storage (single local LevelDB instance) • Usability • Flow • Metrics and configuration as first-class citizens • Metrics aggregation up the entity hierarchy
  • 5. Highlights v.1 v.2 Single writer/reader Timeline Server Distributed writer/collector architecture Single local LevelDB storage* Scalable storage (HBase) v.1 entity model New v.2 entity model No aggregation Metrics aggregation REST API Richer query REST API
  • 6. Architecture • Separation of writers (“collectors”) and readers • Distributed collectors: one collector for each app • Dedicated RM collector for RM-generated data • Collector discovery via RM • Pluggable storage with HBase as default storage
  • 8. What is a flow? • A flow is a group of YARN applications that are launched as parts of a logical app • Oozie, Scalding, Pig, etc. • name: “frequent_visitor_stat” • run id: 1466097809000 • version: “b9b9068”
  • 9. Configuration and metrics • Now explicit top-level attributes of entities • Fine-grained updates and queries made possible • “update metric A to value x” • “query entities where config A = B”
  • 10. Configuration and metrics • Now explicit top-level attributes of entities • Fine-grained updates and queries made possible • “update metric A to value x” • “query entities where config A = B”
  • 11. HBase Storage • Scalable backend • Row Key structure • efficient range scans • KeyPrefixRegionSplitPolicy • Filter pushdown • Coprocessors for flow aggregation (“readless” aggregation) • Cell tags for metadata (application id, aggregation operation) • Cell timestamps generated during put • left shifted with app id added to avoid overwrites
  • 12. Tables in HBase • flow run • application • entity • flow activity • app to flow
  • 13. table: flow run Row key: clusterId!userName!flo wName!inverted(flowRun Id) • most recent flow run stored first • coprocessor enabled
  • 14. table: application Row key: clusterId!userName!flowN ame!inverted(flowRunId)! AppId • applications within a flow run stored together • most recent flow run stored first
  • 15. table: entity Row key: userName!clusterId!flowName!inverted(flo wRunId)!AppId!entityType!entityId • entities within an application within a flow run stored together per type • for example, all containers within a yarn application will be stored together • pre-split table • stores information per entity run like info, relatesTo, relatedTo, events, metrics, config
  • 16. table: flow activity Row key: clusterId!inverted(TopOfTh eDay)!userName!flowName • shows the flows that ran on that day • stores information per flow like number of runs, the run ids, versions
  • 17. table: appToFlow Row key: clusterId!appId - stores mapping of appId to flowName and flowRunId
  • 18. Metrics aggregation • Application level • Rolls up sub-application metrics • Performed in real time in the collectors in memory • Flow run level • Rolls up app level metrics • Performed in HBase region servers via coprocessors • Offline aggregation (TBD) • Rolls up on user, queue, and flow offline periodically • Phoenix tables
  • 21. Reader REST API: paths • URLs under /ws/v2/timeline • Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_n ame/runs/run_id • Path elements may be omitted if they can be inferred • flow context can be inferred by app id • default cluster is assumed if cluster is omitted
  • 22. Setting up Timeline Service v.2 • Set up the HBase cluster (1.1.x) • Add the timeline service jar to HBase • Install the flow run coprocessor • Create tables via TimelineSchemaCreator utility • Configure the YARN cluster • Enable Timeline Service v.2 • Add hbase-site.xml for the timeline collector and readers • Start the timeline reader daemon
  • 23. Milestone 1 ("Alpha 1") • Merge discussion (YARN-2928) in progress as we speak! ✓ Complete end-to-end read/write flow ✓ Real time application and flow aggregation ✓ New entity model ✓ HBase Storage ✓ Rich REST API ✓ Integration with Distributed Shell and MapReduce ✓ YARN generic events and system metrics
  • 24. Milestones - Future • Milestone 2 (“Alpha 2”) • Integration with new YARN UI • Integration with more frameworks • Beta • Freeze API and storage schema • Security • Collectors as containers • Storage fault tolerance • Production-ready • Migration-ready
  • 25. Contributors • Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks) • Varun Saxena, Naganarasimha G. R. (Huawei) • Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter) • Zhijie Shen (now at Facebook) • The HBase and Phoenix community!