(Big Data)²
How YARN Timeline Service v.2 Unlocks 360-Degree
Platform Insights at Scale
Sangjin Lee @sjlee (Twitter)
Joep Rottinghuis @joep (Twitter)
Outline
• Why v.2?
• Highlights
• Developing for Timeline Service v.2
• Setting up Timeline Service v.2
• Milestones
• Demo
Why v.2?
• YARN Timeline Service v 1.x
• Gained good adoption: Tez, HIVE, Pig, etc.
• Keeps improving with v 1.5 APIs and storage implementation
• Still facing some fundamental challenges...
Why v.2?
• Scalability and reliability challenges
• Single instance of Timeline Server
• Storage (single local LevelDB instance)
• Usability
• Flow
• Metrics and configuration as first-class citizens
• Metrics aggregation up the entity hierarchy

Highlights
v.1 | v.2
Single writer/reader Timeline Server | Distributed writer/collector architecture
Single local LevelDB storage* | Scalable storage (HBase)
v.1 entity model | New v.2 entity model
No aggregation | Metrics aggregation
REST API | Richer query REST API
Architecture
• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage
Distributed collectors & readers
What is a flow?
• A flow is a group of YARN applications that are launched as parts of a logical app
• Oozie, Scalding, Pig, etc.
• name: “frequent_visitor_stat”
• run id: 1466097809000
• version: “b9b9068”
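To make the flow concept concrete, here is a minimal Python sketch that groups YARN applications into flow runs keyed by flow name and run id. The `FlowRun` class, the `group_by_flow_run` helper, the application ids, and the second run id are hypothetical and only illustrate the grouping; they are not part of the Timeline Service API.

```python
from dataclasses import dataclass, field


@dataclass
class FlowRun:
    """One run of a flow: a logical app made of several YARN applications."""
    name: str
    run_id: int
    version: str
    app_ids: list = field(default_factory=list)


def group_by_flow_run(apps):
    """Group (app_id, flow_name, run_id, version) tuples into FlowRun objects."""
    runs = {}
    for app_id, name, run_id, version in apps:
        key = (name, run_id)
        if key not in runs:
            runs[key] = FlowRun(name, run_id, version)
        runs[key].app_ids.append(app_id)
    return runs


# Hypothetical applications: two belong to one run, one to a later run.
apps = [
    ("application_1466097000000_0001", "frequent_visitor_stat", 1466097809000, "b9b9068"),
    ("application_1466097000000_0002", "frequent_visitor_stat", 1466097809000, "b9b9068"),
    ("application_1466097000000_0007", "frequent_visitor_stat", 1466184209000, "b9b9068"),
]
flow_runs = group_by_flow_run(apps)
```

Two applications launched by the same run of the flow end up under one `FlowRun`, while a later run of the same flow gets its own entry.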

Configuration and metrics
• Now explicit top-level attributes of entities
• Fine-grained updates and queries made possible
• “update metric A to value x”
• “query entities where config A = B”
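The two quoted operations can be sketched in Python as follows. This is only an illustration of configuration and metrics living as top-level attributes of an entity; the `Entity` class, `update_metric`, `query_by_config`, and the sample ids and metric name are hypothetical, not the v.2 entity classes.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity with configuration and metrics as top-level attributes."""
    entity_type: str
    entity_id: str
    configs: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)

    def update_metric(self, name, value):
        """Fine-grained update: set one metric without rewriting the whole entity."""
        self.metrics[name] = value


def query_by_config(entities, key, value):
    """Return the entities whose configuration has key == value."""
    return [e for e in entities if e.configs.get(key) == value]


entities = [
    Entity("YARN_APPLICATION", "app_1", configs={"mapreduce.job.reduces": "10"}),
    Entity("YARN_APPLICATION", "app_2", configs={"mapreduce.job.reduces": "20"}),
]
entities[0].update_metric("MAP_SLOT_MILLIS", 1200)      # "update metric A to value x"
matches = query_by_config(entities, "mapreduce.job.reduces", "10")  # "config A = B"
```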
HBase Storage
• Scalable backend
• Row Key structure
• efficient range scans
• KeyPrefixRegionSplitPolicy
• Filter pushdown
• Coprocessors for flow aggregation (“readless” aggregation)
• Cell tags for metadata (application id, aggregation operation)
• Cell timestamps generated during put
• left shifted with app id added to avoid overwrites
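One way to picture the last bullet: if two applications write the same metric in the same millisecond, identical cell timestamps would overwrite each other. The sketch below makes room in the low digits of the timestamp and adds the application's sequence number there. The multiplier value, the decimal "shift" by multiplication, and the function names are assumptions for illustration; the actual implementation may differ.

```python
TS_MULTIPLIER = 1_000_000  # assumed: room for the app sequence number in the low digits


def app_sequence(app_id):
    """Extract the trailing sequence number from an id like application_<clusterTs>_<seq>."""
    return int(app_id.rsplit("_", 1)[1])


def supplemented_timestamp(ts_millis, app_id):
    """Shift the timestamp left and add the app sequence number in the freed digits."""
    return ts_millis * TS_MULTIPLIER + app_sequence(app_id) % TS_MULTIPLIER


def original_timestamp(supplemented):
    """Recover the millisecond timestamp by dropping the app-specific digits."""
    return supplemented // TS_MULTIPLIER


# Two hypothetical apps writing at the same millisecond get distinct cell timestamps.
ts_a = supplemented_timestamp(1466097809000, "application_1466097000000_0001")
ts_b = supplemented_timestamp(1466097809000, "application_1466097000000_0002")
```

Because the original timestamp sits in the high digits, cells still sort by time, and the write time can be recovered with an integer division.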
Tables in HBase
• flow run
• application
• entity
• flow activity
• app to flow

table: flow run
Row key: clusterId!userName!flowName!inverted(flowRunId)
• most recent flow run stored first
• coprocessor enabled
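The slides give the flow run row key as clusterId!userName!flowName!inverted(flowRunId). The sketch below shows why inverting the run id puts the most recent run first in a lexicographically sorted store such as HBase. The inversion as `LONG_MAX - value`, the 8-byte big-endian encoding, and the sample cluster and user names are assumptions; escaping of "!" inside names is not handled here.

```python
import struct

LONG_MAX = 2**63 - 1   # largest signed 64-bit value
SEPARATOR = b"!"


def invert_long(value):
    """Map larger values to smaller ones so newer run ids sort first."""
    return LONG_MAX - value


def flow_run_row_key(cluster_id, user_name, flow_name, flow_run_id):
    """Build clusterId!userName!flowName!inverted(flowRunId) as bytes."""
    prefix = SEPARATOR.join(
        s.encode("utf-8") for s in (cluster_id, user_name, flow_name)
    )
    # Big-endian encoding keeps byte order equal to numeric order for non-negative longs.
    return prefix + SEPARATOR + struct.pack(">q", invert_long(flow_run_id))


older = flow_run_row_key("cluster1", "alice", "frequent_visitor_stat", 1466097809000)
newer = flow_run_row_key("cluster1", "alice", "frequent_visitor_stat", 1466184209000)
ordered = sorted([older, newer])
```

A range scan over the shared prefix therefore returns the latest run of a flow without reading older rows first.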
table: application
Row key: clusterId!userName!flowName!inverted(flowRunId)!AppId
• applications within a flow run stored together
• most recent flow run stored first
table: entity
Row key: userName!clusterId!flowName!inverted(flowRunId)!AppId!entityType!entityId
• entities within an application within a flow run stored together per type
• for example, all containers within a YARN application will be stored together
• pre-split table
• stores information per entity run like info, relatesTo, relatedTo, events, metrics, config
table: flow activity
Row key: clusterId!inverted(TopOfTheDay)!userName!flowName
• shows the flows that ran on that day
• stores information per flow like number of runs, the run ids, versions
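The "top of the day" component of the flow activity row key can be read as the timestamp truncated to the start of its day, so every run of a flow on the same day lands in the same row. A minimal sketch, assuming millisecond timestamps and UTC day boundaries (the function name is hypothetical):

```python
DAY_MILLIS = 24 * 60 * 60 * 1000


def top_of_the_day(ts_millis):
    """Truncate a millisecond timestamp to midnight (UTC) of the same day."""
    return ts_millis - (ts_millis % DAY_MILLIS)


run_morning = 1466097809000                      # run id from the flow example
run_later = run_morning + 3 * 60 * 60 * 1000     # hypothetical run three hours later
same_day = top_of_the_day(run_morning) == top_of_the_day(run_later)
```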

table: appToFlow
Row key: clusterId!appId
• stores mapping of appId to flowName and flowRunId
Metrics aggregation
• Application level
• Rolls up sub-application metrics
• Performed in real time in the collectors in memory
• Flow run level
• Rolls up app level metrics
• Performed in HBase region servers via coprocessors
• Offline aggregation (TBD)
• Rolls up on user, queue, and flow offline periodically
• Phoenix tables
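The two online roll-up levels can be sketched with one helper applied twice: container metrics roll up to the application, and application metrics roll up to the flow run. Summation is an assumed aggregation operation, and the metric names and ids are hypothetical; the sketch does not model where the work runs (collector memory or HBase coprocessors).

```python
from collections import defaultdict


def roll_up(metrics_by_child):
    """Sum each metric across children (containers -> app, or apps -> flow run)."""
    totals = defaultdict(int)
    for child_metrics in metrics_by_child.values():
        for name, value in child_metrics.items():
            totals[name] += value
    return dict(totals)


# Application level: roll up sub-application (container) metrics.
container_metrics = {
    "container_01": {"MEMORY": 1024, "CPU": 1},
    "container_02": {"MEMORY": 2048, "CPU": 2},
}
app_metrics = roll_up(container_metrics)

# Flow run level: roll up app level metrics.
flow_run_metrics = roll_up({"app_1": app_metrics, "app_2": {"MEMORY": 512, "CPU": 1}})
```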

Reader REST API: paths
• URLs under /ws/v2/timeline
• Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id
• Path elements may be omitted if they can be inferred
• flow context can be inferred by app id
• default cluster is assumed if cluster is omitted
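A small Python sketch of the path rules above, following the canonical pattern /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id from the slides. The `flow_run_path` helper and the sample cluster and user names are hypothetical; the sketch only builds the path and does not contact a reader.

```python
from urllib.parse import quote

BASE = "/ws/v2/timeline"


def flow_run_path(user, flow, run_id, cluster=None):
    """Build a flow run path; the cluster element is omitted when not given."""
    parts = [BASE]
    if cluster is not None:
        parts += ["clusters", quote(cluster, safe="")]
    parts += [
        "users", quote(user, safe=""),
        "flows", quote(flow, safe=""),
        "runs", str(run_id),
    ]
    return "/".join(parts)


full = flow_run_path("alice", "frequent_visitor_stat", 1466097809000, cluster="cluster1")
short = flow_run_path("alice", "frequent_visitor_stat", 1466097809000)
```

With the cluster omitted, the reader is expected to assume the default cluster, as the last bullet states.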
Setting up Timeline Service v.2
• Set up the HBase cluster (1.1.x)
• Add the timeline service jar to HBase
• Install the flow run coprocessor
• Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
• Enable Timeline Service v.2
• Add hbase-site.xml for the timeline collector and readers
• Start the timeline reader daemon
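For the "Enable Timeline Service v.2" step, the sketch below renders a yarn-site.xml style fragment from a Python dictionary. The two property names and the "2.0f" value are given as recalled from the Hadoop documentation and should be checked against the release in use; a real setup needs further properties (collector and storage settings) that are not shown. `render_properties` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def render_properties(props):
    """Render a name -> value mapping as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumed minimal switches for enabling v.2; verify names and values for your release.
timeline_v2_props = {
    "yarn.timeline-service.enabled": "true",
    "yarn.timeline-service.version": "2.0f",
}
xml_text = render_properties(timeline_v2_props)
```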
Milestone 1 ("Alpha 1")
• Merge discussion (YARN-2928) in progress as we speak!
✓ Complete end-to-end read/write flow
✓ Real time application and flow aggregation
✓ New entity model
✓ HBase Storage
✓ Rich REST API
✓ Integration with Distributed Shell and MapReduce
✓ YARN generic events and system metrics
Milestones - Future
• Milestone 2 (“Alpha 2”)
• Integration with new YARN UI
• Integration with more frameworks
• Beta
• Freeze API and storage schema
• Security
• Collectors as containers
• Storage fault tolerance
• Production-ready
• Migration-ready

Timeline Service v.2 (Hadoop Summit 2016)

This document summarizes the new YARN Timeline Service version 2, which was developed to address scalability, reliability, and usability challenges in version 1. Key highlights of version 2 include a distributed collector architecture for scalable and fault-tolerant writing of timeline data, an entity data model with first-class configuration and metrics support, and metrics aggregation capabilities. It stores data in HBase for scalability and provides a richer REST API for querying. Milestone goals include integration with more frameworks and production readiness.

Contributors
• Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks)
• Varun Saxena, Naganarasimha G. R. (Huawei)
• Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter)
• Zhijie Shen (now at Facebook)
• The HBase and Phoenix community!
Thank you!

HBaseConEast2016: How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale

  • 1. (Big Data)2 How YARN Timeline Service v.2 Unlocks 360-Degree Platform Insights at Scale Sangjin Lee @sjlee (Twitter) Joep Rottinghuis @joep (Twitter)
  • 2. Outline • Why v.2? • Highlights • Developing for Timeline Service v.2 • Setting up Timeline Service v.2 • Milestones • Demo
  • 3. Why v.2? • YARN Timeline Service v 1.x • Gained good adoption: Tez, HIVE, Pig, etc. • Keeps improving with v 1.5 APIs and storage implementation • Still facing some fundamental challenges...
  • 4. Why v.2? • Scalability and reliability challenges • Single instance of Timeline Server • Storage (single local LevelDB instance) • Usability • Flow • Metrics and configuration as first-class citizens • Metrics aggregation up the entity hierarchy
  • 5. Highlights v.1 v.2 Single writer/reader Timeline Server Distributed writer/collector architecture Single local LevelDB storage* Scalable storage (HBase) v.1 entity model New v.2 entity model No aggregation Metrics aggregation REST API Richer query REST API
  • 6. Architecture • Separation of writers (“collectors”) and readers • Distributed collectors: one collector for each app • Dedicated RM collector for RM-generated data • Collector discovery via RM • Pluggable storage with HBase as default storage
  • 8. What is a flow? • A flow is a group of YARN applications that are launched as parts of a logical app • Oozie, Scalding, Pig, etc. • name: “frequent_visitor_stat” • run id: 1466097809000 • version: “b9b9068”
  • 9. Configuration and metrics • Now explicit top-level attributes of entities • Fine-grained updates and queries made possible • “update metric A to value x” • “query entities where config A = B”
  • 10. Configuration and metrics • Now explicit top-level attributes of entities • Fine-grained updates and queries made possible • “update metric A to value x” • “query entities where config A = B”
  • 11. HBase Storage • Scalable backend • Row Key structure • efficient range scans • KeyPrefixRegionSplitPolicy • Filter pushdown • Coprocessors for flow aggregation (“readless” aggregation) • Cell tags for metadata (application id, aggregation operation) • Cell timestamps generated during put • left shifted with app id added to avoid overwrites
  • 12. Tables in HBase • flow run • application • entity • flow activity • app to flow
  • 13. table: flow run Row key: clusterId!userName!flowName!inverted(flowRunId) • most recent flow run stored first • coprocessor enabled
  • 14. table: application Row key: clusterId!userName!flowName!inverted(flowRunId)!AppId • applications within a flow run stored together • most recent flow run stored first
  • 15. table: entity Row key: userName!clusterId!flowName!inverted(flowRunId)!AppId!entityType!entityId • entities within an application within a flow run stored together per type • for example, all containers within a YARN application will be stored together • pre-split table • stores information per entity run like info, relatesTo, relatedTo, events, metrics, config
  • 16. table: flow activity Row key: clusterId!inverted(TopOfTheDay)!userName!flowName • shows the flows that ran on that day • stores information per flow like number of runs, the run ids, versions
  • 17. table: appToFlow Row key: clusterId!appId - stores mapping of appId to flowName and flowRunId
  • 18. Metrics aggregation • Application level • Rolls up sub-application metrics • Performed in real time in the collectors in memory • Flow run level • Rolls up app level metrics • Performed in HBase region servers via coprocessors • Offline aggregation (TBD) • Rolls up on user, queue, and flow offline periodically • Phoenix tables
  • 21. Reader REST API: paths • URLs under /ws/v2/timeline • Canonical REST style URLs: /ws/v2/timeline/clusters/cluster_name/users/user_name/flows/flow_name/runs/run_id • Path elements may be omitted if they can be inferred • flow context can be inferred by app id • default cluster is assumed if cluster is omitted
  • 22. Setting up Timeline Service v.2 • Set up the HBase cluster (1.1.x) • Add the timeline service jar to HBase • Install the flow run coprocessor • Create tables via TimelineSchemaCreator utility • Configure the YARN cluster • Enable Timeline Service v.2 • Add hbase-site.xml for the timeline collector and readers • Start the timeline reader daemon
  • 23. Milestone 1 ("Alpha 1") • Merge discussion (YARN-2928) in progress as we speak! ✓ Complete end-to-end read/write flow ✓ Real time application and flow aggregation ✓ New entity model ✓ HBase Storage ✓ Rich REST API ✓ Integration with Distributed Shell and MapReduce ✓ YARN generic events and system metrics
  • 24. Milestones - Future • Milestone 2 (“Alpha 2”) • Integration with new YARN UI • Integration with more frameworks • Beta • Freeze API and storage schema • Security • Collectors as containers • Storage fault tolerance • Production-ready • Migration-ready
  • 25. Contributors • Li Lu, Junping Du, Vinod Kumar Vavilapalli (Hortonworks) • Varun Saxena, Naganarasimha G. R. (Huawei) • Sangjin Lee, Vrushali Channapattan, Joep Rottinghuis (Twitter) • Zhijie Shen (now at Facebook) • The HBase and Phoenix community!