Oozie High Availability (HA)
Robert Kanter
High Availability
• A system with no unplanned downtime when
partial failures occur
• Typically achieved by having redundancies and removing
single points of failure
• Our Goals
• Don’t change the API or usage patterns
• User doesn’t even have to know it’s HA
The HA Solution
Architectural Overview
The HA Solution: Database
• Oozie stores all state in a database
(submitted jobs, workflow definitions, etc.)
• Instead of a failover model, we want to run many
Oozie servers against the same database
• Active-Active HA
• Also provides horizontal scalability
• ZooKeeper for coordination
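The active-active idea above can be pictured with a small single-process simulation. This is only a sketch under stated assumptions: `LockService` is a hypothetical in-memory stand-in for ZooKeeper-style locks, the `jobs` dictionary stands in for the shared database, and none of the names come from Oozie itself. It shows that any server may pick up any job, provided it holds that job's lock while changing its state.

```python
import threading
from collections import defaultdict


class LockService:
    """Hypothetical in-memory stand-in for ZooKeeper-style per-key locks."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def try_acquire(self, key):
        # Atomically claim the key; return False if someone else holds it.
        with self._guard:
            if key in self._held:
                return False
            self._held.add(key)
            return True

    def release(self, key):
        with self._guard:
            self._held.discard(key)


def run_server(name, jobs, locks, processed):
    """One server scanning the shared job table for work."""
    for job_id in list(jobs):
        if not locks.try_acquire(job_id):
            continue  # another server is working on this job right now
        try:
            # The state check happens under the lock, so a job is
            # processed by exactly one server.
            if jobs[job_id] == "PENDING":
                jobs[job_id] = "DONE"
                processed[name].append(job_id)
        finally:
            locks.release(job_id)


def simulate(num_jobs=50, servers=("hostA", "hostB", "hostC")):
    jobs = {f"job-{i}": "PENDING" for i in range(num_jobs)}  # shared "database"
    locks = LockService()
    processed = defaultdict(list)
    threads = [
        threading.Thread(target=run_server, args=(s, jobs, locks, processed))
        for s in servers
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return jobs, processed
```

Because no job is pinned to a server, adding servers adds capacity, which is the horizontal scalability noted above.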
The HA Solution: Access
• Users and client programs need a single address to
connect to (Web UI, REST/Java API, JobTracker callbacks, etc.)
• A load balancer, virtual IP, or DNS round-robin can be
used to provide a single entry point to the Oozie servers
• Technically, this entry point also needs to be HA
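As a rough illustration of the single entry point, the sketch below rotates requests across a fixed list of Oozie server URLs and skips any that have been marked down. The class name and its methods are hypothetical and exist only for this example; a real deployment would rely on an actual load balancer, virtual IP, or DNS, not on code like this.

```python
import itertools


class RoundRobinEntryPoint:
    """Hands out backend URLs in rotation, skipping ones marked as down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._down = set()
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, url):
        self._down.add(url)

    def mark_up(self, url):
        self._down.discard(url)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            url = next(self._cycle)
            if url not in self._down:
                return url
        raise RuntimeError("no Oozie server is available")
```

Note that the object itself is a single point of failure in this sketch, which is exactly why the entry point also needs to be HA.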
The HA Solution: Log Streaming
• Oozie’s log files are not in the database
• Each Oozie Server only has access to its own logs
• Jobs are not assigned to a specific Oozie server
• What if Oozie Server A wants to get logs for a job
processed by Oozie Server B?
• Oozie Server A can ask Oozie Server B for its logs
• Caveat: If an Oozie Server goes down, any logs from it will
be unavailable until it is brought back up
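The log streaming behaviour described above, including the caveat, can be sketched as follows. This is an illustration only: `fetch` is a hypothetical callable standing in for "ask server X for its log lines for this job", `ServerDown` is an invented exception, and the timestamp layout is assumed to match the log example later in the deck.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # e.g. 2013-09-29 16:46:20,182
TIMESTAMP_WIDTH = 23


class ServerDown(Exception):
    """Raised by fetch() when a server cannot be reached (hypothetical)."""


def collect_job_logs(job_id, servers, fetch):
    """Gather one job's log lines from every server and merge them by time.

    Returns (sorted_lines, unavailable_servers). Lines held by a server
    that is down are simply missing until that server comes back.
    """
    lines, unavailable = [], []
    for server in servers:
        try:
            lines.extend(fetch(server, job_id))
        except ServerDown:
            unavailable.append(server)
    lines.sort(
        key=lambda line: datetime.strptime(line[:TIMESTAMP_WIDTH], TIMESTAMP_FORMAT)
    )
    return lines, unavailable
```

Returning the list of unreachable servers lets a caller tell the user that the log may be incomplete.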
How to Enable HA
Configuration and Security
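• Set up a load balancer, a ZooKeeper ensemble, an HA database,
and multiple identically configured Oozie servers
• Enable Oozie HA services:
<property>
  <name></name>
  <value>
    org.apache.oozie.service.ZKLocksService,
    org.apache.oozie.service.ZKXLogStreamingService,
    org.apache.oozie.service.ZKJobsConcurrencyService
  </value>
</property>
• Point Oozie to the ZooKeeper ensemble:
<property>
  <name>oozie.zookeeper.connection.string</name>
  <value>ZK_HOST1:2181,ZK_HOST2:2181</value>
</property>
• Point the environment variable for callbacks to the load balancer: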
export OOZIE_BASE_URL="http://loadbalancer:11000/oozie"
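A small sanity check for these settings might look like the sketch below. It assumes the properties live in a Hadoop-style site file wrapped in a `<configuration>` element (the wrapper is an assumption, not shown in the slides) and that the ZooKeeper connection string is a comma-separated list of `host:port` pairs, as in the example value. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def parse_zk_connection_string(conn):
    """Split 'host1:2181,host2:2181' into [(host, port), ...]."""
    endpoints = []
    for part in conn.split(","):
        host, sep, port = part.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {part!r}")
        endpoints.append((host, int(port)))
    return endpoints


SAMPLE = """
<configuration>
  <property>
    <name>oozie.zookeeper.connection.string</name>
    <value>ZK_HOST1:2181,ZK_HOST2:2181</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    props = read_properties(SAMPLE)
    print(parse_zk_connection_string(props["oozie.zookeeper.connection.string"]))
```

Running the same check on every Oozie server is one cheap way to confirm that they really are identically configured.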
How to Enable HA: Security
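• Extra step to configure Kerberos with the load balancer:
<property>
  <name>oozie.authentication.kerberos.principal</name>
  <value>HTTP/loadbalancer@REALM</value>
</property>
• Note: this currently prevents clients from talking
directly to any Oozie server
• Enable Kerberos connection to ZooKeeper and ACLs:
<property>
  <name></name>
  <value>true</value>
</property>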
• ACLs prevent malicious users or programs from
interfering with Oozie’s znodes
Using Oozie with HA
• New Oozie CLI/REST API command to list all servers
$ oozie admin -oozie http://loadbalancer:11000/oozie -servers
hostA : http://hostA:11000/oozie
hostB : http://hostB:11000/oozie
hostC : http://hostC:11000/oozie
• Log messages now include which server wrote them
2013-09-29 16:46:20,182 WARN SERVER[hostA] USER[root] GROUP[-] TOKEN[] APP[demo-wf] JOB[0000000-130925230553293-oozie-oozi-W] ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] [***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
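Both outputs above are easy to consume from scripts. The sketch below is an illustration, not an Oozie API: it assumes the `-servers` output uses the `host : url` layout shown, and that a log line (once any wrapped pieces are joined into one line) starts with a timestamp and level followed by `NAME[value]` fields. The function names are hypothetical.

```python
import re

_PREFIX = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(?P<level>[A-Z]+)\s+"
)
_FIELD = re.compile(r"\b([A-Z]+)\[([^\]]*)\]")


def parse_servers_output(text):
    """Map host name to URL from 'host : url' lines; other lines are ignored."""
    servers = {}
    for line in text.splitlines():
        host, sep, url = line.partition(" : ")
        if sep:
            servers[host.strip()] = url.strip()
    return servers


def parse_log_line(line):
    """Extract timestamp, level, and NAME[value] fields (keys lower-cased)."""
    match = _PREFIX.match(line)
    if match is None:
        raise ValueError("line does not start with a timestamp and level")
    record = match.groupdict()
    for key, value in _FIELD.findall(line[match.end():]):
        record.setdefault(key.lower(), value)
    return record


LOG_LINE = (
    "2013-09-29 16:46:20,182 WARN SERVER[hostA] USER[root] GROUP[-] TOKEN[] "
    "APP[demo-wf] JOB[0000000-130925230553293-oozie-oozi-W] "
    "ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] "
    "[***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING"
)

if __name__ == "__main__":
    record = parse_log_line(LOG_LINE)
    print(record["server"], record["job"])
```

With the `server` field extracted, log lines merged from several servers can be grouped or filtered by the host that wrote them.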
To Do
What’s left
• HA support for SLAs and HCatalog integration
• Sharelib Purging with HA
• Log Streaming HA
• With Kerberos, Oozie servers can’t talk to each other
• Breaks log streaming, sharelibupdate
• Other misc improvements

Oozie meetup - HA

  • 1. Oozie High Availability (HA) Robert Kanter
  • 2. High Availability • A system with no unplanned downtime when partial failures occur • Typically achieved by having redundancies and removing single points of failure • Our Goals • Don’t change the API or usage patterns • User doesn’t even have to know it’s HA
  • 4. The HA Solution: Database • Oozie stores all state in a database (submitted jobs, workflow definitions, etc.) • Instead of a failover model, we want to run many Oozie servers against the same database • Active-Active HA • Also provides horizontal scalability • ZooKeeper for coordination
  • 6. The HA Solution: Access • Users and client programs need a single address to connect to (Web UI, REST/Java API, JobTracker callbacks, etc.) • A load balancer, virtual IP, or DNS round-robin can be used to provide a single entry point to the Oozie servers • Technically, this entry point also needs to be HA
  • 8. The HA Solution: Log Streaming • Oozie’s log files are not in the database • Each Oozie Server only has access to its own logs • Jobs are not assigned to a specific Oozie server • What if Oozie Server A wants to get logs for a job processed by Oozie Server B? • Oozie Server A can ask Oozie Server B for its logs • Caveat: If an Oozie Server goes down, any logs from it will be unavailable until it is brought back up
  • 9. The HA Solution: Log Streaming
  • 10. How to Enable HA Configuration and Security
  • 11. How to Enable HA • Set up a load balancer, a ZooKeeper ensemble, an HA database, and multiple identically configured Oozie servers • Enable Oozie HA services: <property> <name></name> <value> org.apache.oozie.service.ZKLocksService, org.apache.oozie.service.ZKXLogStreamingService, org.apache.oozie.service.ZKJobsConcurrencyService </value> </property>
  • 12. How to Enable HA • Point Oozie to the ZooKeeper ensemble: <property> <name>oozie.zookeeper.connection.string</name> <value>ZK_HOST1:2181,ZK_HOST2:2181</value> </property> • Point the environment variable for callbacks to the load balancer: export OOZIE_BASE_URL="http://loadbalancer:11000/oozie"
  • 13. How to Enable HA: Security • Extra step to configure Kerberos with the load balancer: <property> <name>oozie.authentication.kerberos.principal</name> <value>HTTP/loadbalancer@REALM</value> </property> • Note: this currently prevents clients from talking directly to any Oozie server
  • 14. How to Enable HA: Security • Enable Kerberos connection to ZooKeeper and ACLs: <property> <name></name> <value>true</value> </property> • ACLs prevent malicious users or programs from interfering with Oozie’s znodes
  • 16. Using Oozie with HA • New Oozie CLI/REST API command to list all servers $ oozie admin -oozie http://loadbalancer:11000/oozie -servers hostA : http://hostA:11000/oozie hostB : http://hostB:11000/oozie hostC : http://hostC:11000/oozie • Log messages now include which server wrote them 2013-09-29 16:46:20,182 WARN SERVER[hostA] USER[root] GROUP[-] TOKEN[] APP[demo-wf] JOB[0000000-130925230553293-oozie-oozi-W] ACTION[0000000-130925230553293-oozie-oozi-W@streaming-node] [***0000000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
  • 18. To Do • HA support for SLAs and HCatalog integration • Sharelib Purging with HA • Log Streaming HA • With Kerberos, Oozie servers can’t talk to each other • Breaks log streaming, sharelibupdate • Other misc improvements