This document summarizes a talk about Facebook's use of HBase for messaging data. It discusses how Facebook migrated messaging data from MySQL to HBase, storing metadata, search indexes, and small messages in HBase for improved scalability. It also outlines performance improvements made to HBase, such as better compactions and read optimizations, and future plans such as cross-datacenter replication and running HBase in a multi-tenant environment.
The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Ser... (Databricks)
We present the Azure Cognitive Services on Spark, a simple and easy-to-use extension of the SparkML library to all Azure Cognitive Services. This integration allows Spark users to embed cloud intelligence directly into their Spark computations, enabling a new generation of intelligent applications on Spark. Furthermore, we show that with our new containerized Cognitive Services, one can embed cloud intelligence directly into the Spark cluster for ultra-low-latency, on-prem, and offline applications. We show how, using our integration, one can compose these cognitive services with other services, SQL computations, and deep networks to create sophisticated and intelligent heterogeneous applications. Moreover, we show how to redeploy these compositions as RESTful services with Spark Serving. We will also explore the architecture of these contributions, which leverage HTTP on Spark, a novel integration between Spark and the widely used Hypertext Transfer Protocol (HTTP). This library can integrate into the Spark ecosystem any framework capable of communicating through HTTP. Finally, we demonstrate how to use these services to create a large class of intelligent applications such as custom search engines, real-time facial recognition systems, and unsupervised object detectors.
GCP Meetup #3 - Approaches to Cloud Native Architectures (nine)
Talk by Daniel Leahy and Nic Gibson, given at the Google Cloud Meetup on March 3, 2020, hosted by Nine Internet Solutions AG - Your Swiss Managed Cloud Service Provider.
Hadoop Infrastructure @Uber Past, Present and Future (DataWorks Summit)
Uber’s mission is to provide transportation as reliable as running water, and data plays a critical role in fulfilling that mission. At Uber, Hadoop plays a critical role in the data infrastructure. We will talk about the journey of Hadoop at Uber and our future plans for scaling to billions of trips. We will discuss Uber's most unique use cases and how Hadoop and the ecosystem we built helped us on this journey. We will talk about how we scaled from 10 to 2,000 nodes, and how we plan to scale to tens of thousands of nodes in the future. We will cover our mistakes, learnings, and wins, and how we process billions of events per day. We will talk about unique challenges and real-world use cases, and how we will co-locate Uber's service architecture with batch workloads (e.g., data pipelines, machine learning, and analytical workloads). Uber has made many improvements to the current Hadoop ecosystem and has uniquely solved some problems in ways that have never been solved before. This presentation will give the audience an example to follow and may even encourage them to enhance the ecosystem, growing the communities of these projects and helping the whole big data space. The audience is anybody working on big data who wants to understand how to scale Hadoop and its ecosystem to tens of thousands of nodes. This talk will help them understand the Hadoop ecosystem and how to use it efficiently, and will introduce some of the awesome technologies the Uber team is building in the big data space.
A talk given by Ted Dunning in February 2013 on Apache Drill, an open-source, community-driven project to provide easy, dependable, fast and flexible ad hoc query capabilities.
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ... (DataStax Academy)
The state of analytics has changed dramatically over the last few years. Hadoop is now commonplace, and the ecosystem has evolved to include new tools such as Spark, Shark, and Drill, that live alongside the old MapReduce-based standards. It can be difficult to keep up with the pace of change, and newcomers are left with a dizzying variety of seemingly similar choices. This is compounded by the number of possible deployment permutations, which can cause all but the most determined to simply stick with the tried and true. But there are serious advantages to many of the new tools, and this presentation will give an analysis of the current state–including pros and cons as well as what’s needed to bootstrap and operate the various options.
About Robbie Strickland, Software Development Manager at The Weather Channel
Robbie works for The Weather Channel’s digital division as part of the team that builds backend services for weather.com and the TWC mobile apps. He has been involved in the Cassandra project since 2010 and has contributed in a variety of ways over the years; this includes work on drivers for Scala and C#, the Hadoop integration, heading up the Atlanta Cassandra Users Group, and answering lots of Stack Overflow questions.
http://bit.ly/1BTaXZP – Hadoop has been a huge success in the data world. It’s disrupted decades of data management practices and technologies by introducing a massively parallel processing framework. The community and the development of all the Open Source components pushed Hadoop to where it is now.
That's why the Hadoop community is excited about Apache Spark. The Spark software stack includes a core data-processing engine, an interface for interactive querying, Spark Streaming for streaming data analysis, and growing libraries for machine-learning and graph analysis. Spark is quickly establishing itself as a leading environment for doing fast, iterative in-memory and streaming analysis.
This talk will give an introduction to the Spark stack, explain how Spark delivers lightning-fast results, and show how it complements Apache Hadoop.
Keys Botzum - Senior Principal Technologist with MapR Technologies
Keys is Senior Principal Technologist with MapR Technologies, where he wears many hats. His primary responsibility is interacting with customers in the field, but he also teaches classes, contributes to documentation, and works with engineering teams. He has over 15 years of experience in large scale distributed system design. Previously, he was a Senior Technical Staff Member with IBM, and a respected author of many articles on the WebSphere Application Server as well as a book.
Maintaining Low Latency While Maximizing Throughput on a Single Cluster (MapR Technologies)
The good news: Hadoop has a lot of tools. The bad news: Hadoop has a lot of tools, and conflicting priorities. This talk shows how advances in YARN and Mesos allow you to run multiple distinct workloads together. We show how to use SLA and latency rules along with preemption in YARN to maintain high throughput while guaranteeing latency for applications such as HBase and Drill.
These slides provide highlights of my book HDInsight Essentials. Book link is here: http://www.packtpub.com/establish-a-big-data-solution-using-hdinsight/book
Spark and Spark Streaming can process streaming data using a technique called Discretized Streams (D-Streams) that divides the data into small batch intervals. This allows Spark to provide fault tolerance through checkpointing and recovery of state across intervals. Spark Streaming also introduces the concept of "exactly-once" processing semantics through checkpointing and write ahead logs. Spark Structured Streaming builds on these concepts and adds SQL support and watermarking to allow incremental processing of streaming data.
Application architectures with Hadoop – Big Data TechCon 2014 (hadooparchbook)
Building applications using Apache Hadoop with a use-case of clickstream analysis. Presented by Mark Grover and Jonathan Seidman at Big Data TechCon, Boston in April 2014
Building a Business on Hadoop, HBase, and Open Source Distributed Computing (Bradford Stephens)
This is a talk on a fundamental approach to thinking about scalability, and how Hadoop, HBase, and Lucene are enabling companies to process amazing amounts of data. It's also about how Social Media is making the traditional RDBMS irrelevant.
Building a Pluggable Analytics Stack with Cassandra (Jim Peregord, Element Co...) (DataStax)
Element Fleet has the largest benchmark database in our industry and we needed a robust and linearly scalable platform to turn this data into actionable insights for our customers. The platform needed to support advanced analytics, streaming data sets, and traditional business intelligence use cases.
In this presentation, we will discuss how we built a single, unified platform for both Advanced Analytics and traditional Business Intelligence using Cassandra on DSE. With Cassandra as our foundation, we are able to plug in the appropriate technology to meet varied use cases. The platform we’ve built supports real-time streaming (Spark Streaming/Kafka), batch and streaming analytics (PySpark, Spark Streaming), and traditional BI/data warehousing (C*/FiloDB). In this talk, we are going to explore the entire tech stack and the challenges we faced trying to support the above use cases. We will specifically discuss how we ingest and analyze IoT (vehicle telematics) data in real time and batch, combine data from multiple data sources into a single data model, and support standardized and ad-hoc reporting requirements.
About the Speaker
Jim Peregord, Vice President - Analytics, Business Intelligence, Data Management, Element Corp.
The NameNode was experiencing high load and instability after being restarted. Graphs showed unknown high load between checkpoints on the NameNode. DataNode logs showed repeated 60000 millisecond timeouts in communication with the NameNode. Thread dumps revealed NameNode server handlers waiting on the same lock, indicating a bottleneck. Source code analysis pointed to repeated block reports from DataNodes to the NameNode as the likely cause of the high load.
This document discusses loading data from Hadoop into Oracle databases using Oracle connectors. It describes how the Oracle Loader for Hadoop and Oracle SQL Connector for HDFS can load data from HDFS into Oracle tables much faster than traditional methods like Sqoop by leveraging parallel processing in Hadoop. The connectors optimize the loading process by automatically partitioning, sorting, and formatting the data into Oracle blocks to achieve high performance loads. Measuring the CPU time needed per gigabyte loaded allows estimating how long full loads will take based on available resources.
Summary of recent progress on Apache Drill, an open-source community-driven project to provide easy, dependable, fast and flexible ad hoc query capabilities.
Unified Batch & Stream Processing with Apache Samza (DataWorks Summit)
The traditional lambda architecture has been a popular solution for joining offline batch operations with real time operations. This setup incurs a lot of developer and operational overhead since it involves maintaining code that produces the same result in two, potentially different distributed systems. In order to alleviate these problems, we need a unified framework for processing and building data pipelines across batch and stream data sources.
Based on our experiences running and developing Apache Samza at LinkedIn, we have enhanced the framework to support: a) Pluggable data sources and sinks; b) A deployment model supporting different execution environments such as YARN or VMs; c) A unified processing API for developers to work seamlessly with batch and stream data. In this talk, we will cover how these design choices in Apache Samza help tackle the overhead of lambda architecture. We will use some real production use-cases to elaborate how LinkedIn leverages Apache Samza to build unified data processing pipelines.
Speaker
Navina Ramesh, Sr. Software Engineer, LinkedIn
I gave this talk at the Highload++ conference 2015 in Moscow. The slides have been translated into English. They cover the Apache HAWQ components, its architecture, query processing logic, and also competitive information.
Exponea - Kafka and Hadoop as components of architecture (MartinStrycek)
Kafka and Hadoop were introduced at Exponea to address several issues:
- The in-memory database was very fast but limited by memory constraints. Customers wanted the freedom to analyze all their data.
- Processing large volumes of streaming data was problematic.
- HDFS does not support appending to existing files, so Kafka was introduced to stream data for storage in Hadoop.
- The new technologies introduced monitoring challenges for the expanded data stack.
Architectural considerations for Hadoop Applications (hadooparchbook)
The document discusses architectural considerations for Hadoop applications using a case study on clickstream analysis. It covers requirements for data ingestion, storage, processing, and orchestration. For data storage, it considers HDFS vs HBase, file formats, and compression formats. SequenceFiles are identified as a good choice for raw data storage as they allow for splittable compression.
Facebook uses HBase running on HDFS to store messaging data and metadata. Key reasons for choosing HBase include high write throughput, horizontal scalability, and integration with HDFS. Typical clusters have multiple regions and racks for redundancy. Facebook stores small messages, metadata, and attachments in HBase, while larger messages and attachments are stored separately. The system processes billions of read and write operations daily and continues to optimize performance and reliability.
Building Mission Critical Messaging System On Top Of HBase
Facebook chose HBase as the storage system for its messaging platform due to HBase's high write throughput, good random read performance, horizontal scalability, and automatic failover. Facebook stores messages, metadata, and search indices in HBase. To improve performance and reliability, Facebook developed the system on a production-stabilized branch of HBase, used shadow testing, added extensive monitoring, and contributed improvements back to the HBase community.
This document provides an overview of HBase, including:
- HBase is a distributed, scalable, big data store modeled after Google's BigTable. It provides a fault-tolerant way to store large amounts of sparse data.
- HBase is used by large companies to handle scaling and sparse data better than relational databases. It features automatic partitioning, linear scalability, commodity hardware, and fault tolerance.
- The document discusses HBase operations, schema design best practices, hardware recommendations, alerting, backups and more. It provides guidance on designing keys, column families and cluster configuration to optimize performance for read and write workloads.
Introduction to HBase. HBase is a NoSQL database which has experienced a tremendous increase in popularity in recent years. Large companies like Facebook, LinkedIn, and Foursquare are using HBase. In this presentation we will address questions like: What is HBase? How does it compare to relational databases? What is the architecture? How does HBase work? What about schema design? What about the IT resources? These questions should help you consider whether this solution might be suitable in your case.
This document contains information about HBase concepts and configurations. It discusses different modes of HBase operation including standalone, pseudo-distributed, and distributed modes. It also covers basic prerequisites for running HBase like Java, SSH, DNS, NTP, ulimit settings, and Hadoop for distributed mode. The document explains important HBase configuration files like hbase-site.xml, hbase-default.xml, hbase-env.sh, log4j.properties, and regionservers. It provides details on column-oriented versus row-oriented databases and discusses optimizations that can be made through configuration settings.
Hw09 Practical HBase Getting The Most From Your HBase Install (Cloudera, Inc.)
The document summarizes two presentations about using HBase as a database. It discusses the speakers' experiences using HBase at StumbleUpon and Streamy to replace MySQL and other relational databases. Some key points covered include how HBase provides scalability, flexibility, and cost benefits over SQL databases for large datasets.
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends (Esther Kundin)
An overview of the history of Big Data, followed by a deep dive into the Hadoop ecosystem. Detailed explanation of how HDFS, MapReduce, and HBase work, followed by a discussion of how to tune HBase performance. Finally, a look at industry trends, including challenges faced and being solved by Bloomberg for using Hadoop for financial data.
The document provides an introduction to NoSQL and HBase. It discusses what NoSQL is, the different types of NoSQL databases, and compares NoSQL to SQL databases. It then focuses on HBase, describing its architecture and components like HMaster, regionservers, Zookeeper. It explains how HBase stores and retrieves data, the write process involving memstores and compaction. It also covers HBase shell commands for creating, inserting, querying and deleting data.
The document discusses how HDFS architecture has evolved to meet new requirements for higher scalability, availability, and improved random read performance. It summarizes the key aspects of HDFS architecture in 2010, including limitations, and improvements made since then, such as read pipeline optimizations, federated namespaces, and high availability name nodes. It also outlines future directions for HDFS architecture.
At StampedeCon 2012 in St. Louis, Pritam Damania presents: Reliable backup and recovery is one of the main requirements for any enterprise grade application. HBase has been very well embraced by enterprises needing random, real-time read/write access with huge volumes of data and ease of scalability. As such, they are looking for backup solutions that are reliable, easy to use, and can co-exist with existing infrastructure. HBase comes with several backup options but there is a clear need to improve the native export mechanisms. This talk will cover various options that are available out of the box, their drawbacks and what various companies are doing to make backup and recovery efficient. In particular it will cover what Facebook has done to improve performance of backup and recovery process with minimal impact to production cluster.
Apache Hadoop India Summit 2011 talk "Searching Information Inside Hadoop Pla..." (Yahoo Developer Network)
The document discusses different approaches for searching large datasets in Hadoop, including MapReduce, Lucene/Solr, and building a new search engine called HSearch. Some key challenges with existing approaches included slow response times and the need for manual sharding. HSearch indexes data stored in HDFS and HBase. The document outlines several techniques used in HSearch to improve performance, such as using SSDs selectively, reducing HBase table size, distributing queries across region servers, moving processing near data, byte block caching, and configuration tuning. Benchmarks showed HSearch could return results for common words from a 100 million page index within seconds.
Data Storage and Management project Report (Tushar Dalvi)
This paper evaluates the performance of random reads and random writes in HBase and Cassandra and compares the results obtained through various Ubuntu operations.
Speaker: Varun Sharma (Pinterest)
Over the past year, HBase has become an integral component of Pinterest's storage stack. HBase has enabled us to quickly launch and iterate on new products and create amazing pinner experiences. This talk briefly describes some of these applications, the underlying schema, and how our HBase setup stays highly available and performant despite billions of requests every week. It will also include some performance tips for running on SSDs. Finally, we will talk about a homegrown serving technology we built from a mashup of HBase components that has gained wide adoption across Pinterest.
CCS334 BIG DATA ANALYTICS UNIT 5 PPT ELECTIVE PAPER (KrishnaVeni451953)
HBase is an open source, column-oriented database built on top of Hadoop that allows for the storage and retrieval of large amounts of sparse data. It provides random real-time read/write access to this data stored in Hadoop and scales horizontally. HBase features include automatic failover, integration with MapReduce, and storing data as multidimensional sorted maps indexed by row, column, and timestamp. The architecture consists of a master server (HMaster), region servers (HRegionServer), regions (HRegions), and Zookeeper for coordination.
Facebook - Jonathan Gray - Hadoop World 2010 (Cloudera, Inc.)
The document summarizes HBase use at Facebook, including its development and future work. HBase is used for incremental updates to data warehouses, high frequency analytics, and write-intensive workloads. Development includes Hive integration, master high availability, and random read optimizations. Future work focuses on coprocessors, intelligent load balancing, and cluster performance.
The document discusses backup and disaster recovery strategies for Hadoop. It focuses on protecting data sets stored in HDFS. HDFS uses data replication and checksums to protect against disk and node failures. Snapshots can protect against data corruption and accidental deletes. The document recommends copying data from the primary to secondary site for disaster recovery rather than teeing, and discusses considerations for large data movement like bandwidth needs and security. It also notes the importance of backing up metadata like Hive configurations along with core data.
Hbase status quo apache-con europe - nov 2012 (Chris Huang)
The document summarizes the status of HBase and its relationship with HDFS. In the past, HDFS did not prioritize HBase's needs, but reliability, availability, and performance have improved with Hadoop 1.0 and 2.0. Hadoop 2.0 features like HDFS high availability and wire compatibility directly benefit HBase. Further improvements planned for Hadoop 2.x like direct reads and zero-copy support could significantly boost HBase performance. The HBase project is also advancing with new versions focused on features like coprocessors and performance optimizations.
The document provides an overview of big data and Hadoop fundamentals. It discusses what big data is, the characteristics of big data, and how it differs from traditional data processing approaches. It then describes the key components of Hadoop including HDFS for distributed storage, MapReduce for distributed processing, and YARN for resource management. HDFS architecture and features are explained in more detail. MapReduce tasks, stages, and an example word count job are also covered. The document concludes with a discussion of Hive, including its use as a data warehouse infrastructure on Hadoop and its query language HiveQL.
The document discusses different topics related to software development processes and tools. It provides information about Scrum methodology and roles like Product Manager and development teams. It also talks about version control tools like SVN and continuous integration tools like Hudson. Various software development concepts are explained like trunk-based development, feature flags, and deploying features to production in phases. Overall workflows involving coding, code reviews, testing and deploying software updates are described.
This document summarizes new features in Spring 3 and 3.1 for component-based application design. Spring 3 focuses on annotation-based components while also supporting concise XML configurations. Key features include stereotypes, factory methods, expression language support, standardized annotations, validation, formatting, scheduling, and REST support. Spring 3.1 enhances environments with profiles for bean definitions, enables Java-based configuration, adds a "c:" namespace for XML, and introduces declarative caching capabilities.
The document discusses Netflix's cloud architecture on Amazon Web Services (AWS). It aims to be faster, scalable, available and allow developers to work more productively. Some key points are moving from a central SQL database to distributed NoSQL stores, replacing sticky in-memory sessions with a shared cache, and optimizing for latency tolerance over chatty protocols. The architecture also focuses on layered service interfaces over tangled code and instrumenting services rather than code.
This document discusses Google's infrastructure and data centers. It describes Google's use of large data centers containing thousands of servers and petabytes of storage. It also summarizes Google's development of technologies like GFS, MapReduce, and BigTable to handle massive amounts of data across their infrastructure. Key details are provided on hardware specifications, network switches, reliability targets, and the engineers involved in developing Google's data-handling systems.
Netflix uses cloud computing to address challenges in scaling its infrastructure to support unpredictable growth. It has transitioned its website to be nearly 100% cloud-based using Amazon Web Services (AWS) to gain the scale, availability and agility needed. AWS provides tools and features like auto-scaling that allow Netflix to easily expand capacity as its subscriber base grows by over 50% per year. By leveraging AWS' mature cloud platform, Netflix can focus on its core video business rather than managing data centers.
The document discusses domain-driven design and modeling complex domains. It provides an example of modeling a shipping domain to understand cargo routing. Entities in the domain include Cargo, Itinerary, and Leg. A Cargo has an origin and destination. An Itinerary is generated by a Routing Service and consists of a series of Legs, where each Leg specifies a load and unload location for the Cargo. Modeling these concepts helps address routing needs like booking or rerouting shipments.
This document discusses how to create "big agility" by focusing on goals and outcomes rather than processes. It advocates questioning assumptions and continuously learning through experiments. Key points discussed include developing personas and story maps to understand users' needs, planning iterations to balance discovery, delivery and learning, and measuring real value delivered rather than effort spent. Cross-team collaboration and creating a shared understanding of what success means for stakeholders is also emphasized. The document provides examples of tools and practices for building agility within, across, and outside of teams.
6. Monthly data volume prior to launch
15B messages x 1,024 bytes = 14TB
120B chat messages x 100 bytes = 11TB
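A quick sanity check of the arithmetic, assuming the slide's TB figures are binary terabytes (TiB):

```latex
15 \times 10^{9} \times 1024\,\mathrm{B} = 15.36 \times 10^{12}\,\mathrm{B} \approx 14\,\mathrm{TiB},
\qquad
120 \times 10^{9} \times 100\,\mathrm{B} = 12 \times 10^{12}\,\mathrm{B} \approx 11\,\mathrm{TiB}.
```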
7. Messaging Data
▪ Small/medium sized data → HBase
▪ Message metadata & indices
▪ Search index
▪ Small message bodies
▪ Attachments and large messages → Haystack
▪ Used for our existing photo/video store
8. Open Source Stack
▪ Memcached --> App Server Cache
▪ ZooKeeper --> Small Data Coordination Service
▪ HBase --> Database Storage Engine
▪ HDFS --> Distributed FileSystem
▪ Hadoop --> Asynchronous Map-Reduce Jobs
9. Our architecture
[Diagram: Clients (Front End, MTA, etc.) ask the User Directory Service "What's the cell for this user?" and are routed to the right cell (Cell 1, Cell 2, Cell 3). Each cell runs Application Servers backed by an HBase/HDFS/ZooKeeper cluster that stores messages, metadata, and the search index; attachments and large messages go to Haystack.]
11. HBase in a nutshell
• distributed, large-scale data store
• efficient at random reads/writes
• initially modeled after Google’s BigTable
• open source project (Apache)
12. When to use HBase?
▪ storing large amounts of data
▪ need high write throughput
▪ need efficient random access within large data sets
▪ need to scale gracefully with data
▪ for structured and semi-structured data
▪ don’t need full RDBMS capabilities (cross table transactions, joins, etc.)
13. HBase Data Model
• An HBase table is:
• a sparse, three-dimensional array of cells, indexed by:
RowKey, ColumnKey, Timestamp/Version
• sharded into regions along an ordered RowKey space
• Within each region:
• Data is grouped into column families
▪ Sort order within each column family:
Row Key (asc), Column Key (asc), Timestamp (desc)
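To make the model concrete, here is a minimal sketch using the classic (0.94-era) HBase Java client; the table, family, and column names are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class DataModelSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "demo"); // hypothetical table

        // A cell is addressed by (RowKey, ColumnFamily:Qualifier, Timestamp).
        // Write two versions of the same cell with explicit timestamps.
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), 1L, Bytes.toBytes("v1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), 2L, Bytes.toBytes("v2"));
        table.put(put);

        // Reads see versions newest-first, matching the sort order:
        // Row Key (asc), Column Key (asc), Timestamp (desc).
        Get get = new Get(Bytes.toBytes("row1"));
        get.setMaxVersions(2);
        Result result = table.get(get);
        System.out.println(result); // cf:col at timestamp 2, then timestamp 1

        table.close();
    }
}
```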
14. Example: Inbox Search
• Schema
• Key: RowKey: userid, Column: word, Version: MessageID
• Value: Auxiliary info (like offset of word in message)
• Data is stored sorted by <userid, word, messageID>:
User1:hi:17->offset1
User1:hi:16->offset2
User1:hello:16->offset3
User1:hello:2->offset4
...
User2:...
...
• Can efficiently handle queries like:
- Get top N messageIDs for a specific user & word
- Typeahead query: for a given user, get words that match a prefix
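A hedged sketch of how those two queries might look with the HBase Java client, given the schema above (the column family name "s" is an assumption):

```java
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class InboxSearchQueries {
    // Top-N message IDs for (user, word): the message ID is the cell version,
    // so requesting N versions of one column returns the newest N matches.
    static Result topN(HTable table, String user, String word, int n) throws Exception {
        Get get = new Get(Bytes.toBytes(user));
        get.addColumn(Bytes.toBytes("s"), Bytes.toBytes(word)); // family "s" is hypothetical
        get.setMaxVersions(n);
        return table.get(get);
    }

    // Typeahead: columns within a row are sorted by word, so a column-prefix
    // filter on the user's row returns all indexed words matching the prefix.
    static Result typeahead(HTable table, String user, String prefix) throws Exception {
        Get get = new Get(Bytes.toBytes(user));
        get.setFilter(new ColumnPrefixFilter(Bytes.toBytes(prefix)));
        return table.get(get);
    }
}
```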
15. HBase System Overview
[Diagram: Database layer: HBase, with an active Master, a Backup Master, and many Region Servers. Coordination service: a ZooKeeper quorum of ZK peers. Storage layer: HDFS, with a Namenode, a Secondary Namenode, and many Datanodes.]
16. HBase Overview
[Diagram: an HBase Region Server hosts many regions (Region #1, Region #2, ...). Each region contains one or more column families (ColumnFamily #1, ColumnFamily #2, ...); each column family has a Memstore (an in-memory data structure) that is flushed to HFiles in HDFS, and edits are recorded in a Write Ahead Log in HDFS.]
17. HBase Overview
• Very good at random reads/writes
• Write path
• Sequential write/sync to commit log
• update memstore
• Read path
• Lookup memstore & persistent HFiles
• HFile data is sorted and has a block index for efficient retrieval
• Background chores
• Flushes (memstore -> HFile)
• Compactions (group of HFiles merged into one)
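The write path described above can be summarized in code. This is a schematic sketch with illustrative toy types, not HBase's actual internals:

```java
import java.io.IOException;

// Schematic of the region-server write path; names are illustrative only.
class WritePathSketch {
    interface Log { void append(byte[] edit) throws IOException; void sync() throws IOException; }
    interface MemStore { void add(byte[] edit); long size(); }

    private final Log wal;
    private final MemStore memstore;
    private final long flushThreshold;

    WritePathSketch(Log wal, MemStore memstore, long flushThreshold) {
        this.wal = wal;
        this.memstore = memstore;
        this.flushThreshold = flushThreshold;
    }

    void applyWrite(byte[] edit) throws IOException {
        wal.append(edit);   // 1. sequential append to the commit log (WAL) in HDFS
        wal.sync();         // 2. sync so the edit is durable before acking the client
        memstore.add(edit); // 3. update the in-memory sorted structure
        if (memstore.size() > flushThreshold) {
            flush();        // background chore: memstore -> immutable HFile in HDFS
        }
    }

    void flush() { /* write sorted memstore contents out as a new HFile;
                      compactions later merge groups of HFiles into one */ }
}
```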
19. Horizontal scalability
▪ HBase & HDFS are elastic by design
▪ Multiple table shards (regions) per physical server
▪ On node additions
▪ Load balancer automatically reassigns shards from overloaded nodes to new nodes
▪ Because the filesystem underneath is itself distributed, data for reassigned regions is instantly servable from the new nodes.
▪ Regions can be dynamically split into smaller regions.
▪ Pre-sharding is not necessary
▪ Splits are near instantaneous!
20. Automatic Failover
▪ Node failures automatically detected by HBase Master
▪ Regions on failed node are distributed evenly among surviving nodes.
▪ Multiple regions/server model avoids need for substantial overprovisioning
▪ HBase Master failover
▪ 1 active, rest standby
▪ When active master fails, a standby automatically takes over
21. HBase uses HDFS
We get the benefits of HDFS as a storage system for free
▪ Fault tolerance (block level replication for redundancy)
▪ Scalability
▪ End-to-end checksums to detect and recover from corruptions
▪ Map Reduce for large scale data processing
▪ HDFS already battle tested inside Facebook
▪ running petabyte scale clusters
▪ lot of in-house development and operational experience
22. Simpler Consistency Model
▪ HBase’s strong consistency model
▪ simpler for a wide variety of applications to deal with
▪ client gets same answer no matter which replica data is read from
▪ Eventual consistency: tricky for applications fronted by a cache
▪ replicas may heal eventually during failures
▪ but stale data could remain stuck in cache
23. Typical Cluster Layout
▪ Multiple clusters/cells for messaging
▪ 20 servers/rack; 5 or more racks per cluster
▪ Controllers (master/Zookeeper) spread across racks
[Diagram: five racks of 20 servers each. Controllers are spread across the racks: each rack's controller node runs a ZooKeeper Peer plus one master daemon: HDFS Namenode (Rack #1), Backup Namenode (Rack #2), Job Tracker (Rack #3), HBase Master (Rack #4), Backup Master (Rack #5). The remaining 19 servers per rack each run a Region Server, Data Node, and Task Tracker.]
26. Goal of Zero Data Loss/Correctness
▪ sync support added to hadoop-20 branch
▪ for keeping transaction log (WAL) in HDFS
▪ to guarantee durability of transactions
▪ Row-level ACID compliance
▪ Enhanced HDFS’s Block Placement Policy:
▪ Original: rack aware, but minimally constrained
▪ Now: Placement of replicas constrained to configurable node groups
▪ Result: Data loss probability reduced by orders of magnitude
27. Availability/Stability improvements
▪ HBase master rewrite: region assignments using ZK
▪ Rolling Restarts – doing software upgrades without a downtime
▪ Interrupt Compactions – prioritize availability over minor perf gains
▪ Timeouts on client-server RPCs
▪ Staggered major compaction to avoid compaction storms
28. Performance Improvements
▪ Compactions
▪ critical for read performance
▪ Improved compaction algo
▪ delete/TTL/overwrite processing in minor compactions
▪ Read optimizations:
▪ Seek optimizations for rows with large number of cells
▪ Bloom filters to minimize HFile lookups
▪ Timerange hints on HFiles (great for temporal data)
▪ Improved handling of compressed HFiles
29. Operational Experiences
▪ Darklaunch:
▪ shadow traffic on test clusters for continuous, at scale testing
▪ experiment/tweak knobs
▪ simulate failures, test rolling upgrades
▪ Constant (pre-sharding) region count & controlled rolling splits
▪ Administrative tools and monitoring
▪ Alerts (HBCK, memory alerts, perf alerts, health alerts)
▪ auto detecting/decommissioning misbehaving machines
▪ Dashboards
▪ Application level backup/recovery pipeline
30. Working within the Apache community
▪ Growing with the community
▪ Started with a stable, healthy project
▪ In-house expertise in both HDFS and HBase
▪ Increasing community involvement
▪ Undertook massive feature improvements with community help
▪ HDFS 0.20-append branch
▪ HBase Master rewrite
▪ Continually interacting with the community to identify and fix issues
▪ e.g., large responses (2GB RPC)
33. Move messaging data from MySQL to HBase
▪ In MySQL, inbox data was kept normalized
▪ user’s messages are stored across many different machines
▪ Migrating a user is basically one big join across tables spread over many different machines
▪ Multiple terabytes of data (for over 500M users)
▪ Cannot pound 1000s of production UDBs to migrate users
34. How we migrated
▪ Periodically, get a full export of all the users’ inbox data in MySQL
▪ And, use bulk loader to import the above into a migration HBase cluster
▪ To migrate users:
▪ Since users may continue to receive messages during migration:
▪ double-write (to old and new system) during the migration period
▪ Get a list of all recent messages (since last MySQL export) for the user
▪ Load new messages into the migration HBase cluster
▪ Perform the join operations to generate the new data
▪ Export it and upload into the final cluster
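The double-write step above might look like the following wrapper; this is a hypothetical sketch of the pattern, not Facebook's actual code:

```java
interface Message {}
interface MessageStore { void write(Message m); }

// Hypothetical double-write wrapper used only during the migration window:
// every new message goes to both the old and the new system, so the new
// cluster stays current while the MySQL exports remain authoritative.
class DoubleWritingMessageStore implements MessageStore {
    private final MessageStore mysqlStore; // old system
    private final MessageStore hbaseStore; // migration HBase cluster

    DoubleWritingMessageStore(MessageStore mysqlStore, MessageStore hbaseStore) {
        this.mysqlStore = mysqlStore;
        this.hbaseStore = hbaseStore;
    }

    public void write(Message m) {
        mysqlStore.write(m); // old and new systems both see every new message
        hbaseStore.write(m);
    }
}
```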
36. Facebook Insights Goes Real-Time
▪ Recently launched real-time analytics for social plugins on top of HBase
▪ Publishers get real-time distribution/engagement metrics:
▪ # of impressions, likes
▪ analytics by
▪ Domain, URL, demographics
▪ Over various time periods (the last hour, day, all-time)
▪ Makes use of HBase capabilities like:
▪ Efficient counters (read-modify-write increment operations)
▪ TTL for purging old data
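A minimal sketch of both capabilities with the HBase Java client; the table, family, and column names here are assumptions, not the real Insights schema:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class InsightsSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "insights"); // hypothetical table

        // Efficient counters: an atomic, server-side read-modify-write increment.
        table.incrementColumnValue(
                Bytes.toBytes("example.com"), // row: the domain being tracked
                Bytes.toBytes("m"),           // column family (assumed)
                Bytes.toBytes("impressions"), // counter column
                1L);                          // delta

        table.close();

        // TTL is a column-family property: cells older than the TTL are
        // purged automatically during compactions.
        HColumnDescriptor hourly = new HColumnDescriptor("m");
        hourly.setTimeToLive(60 * 60); // keep hourly metrics for one hour (seconds)
    }
}
```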
37. Future Work
It is still early days…!
▪ Namenode HA (AvatarNode)
▪ Fast hot-backups (Export/Import)
▪ Online schema & config changes
▪ Running HBase as a service (multi-tenancy)
▪ Features (like secondary indices, batching hybrid mutations)
▪ Cross-DC replication
▪ Lot more performance/availability improvements