© Cloudera, Inc. All rights reserved.
Apache Kudu Webinar Series
Understanding and Unlocking
the Value of Real-Time Data
Ryan Lippert | Cloudera
Michele Goetz | Forrester (Special Guest)
Kudu Webinar Series
Part 1: Lambda Architectures – Simplified by Apache Kudu
A look into the potential trouble involved with a lambda architecture, and how Apache Kudu can
dramatically simplify real-time analytics.
Part 2: Extending the Capabilities of Operational and Analytical Databases
An examination of how Apache Kudu expands the set of use cases that Cloudera’s Operational and
Analytical databases can handle.
Part 3: Data-in-Motion: Unlock the Value of Real-Time Data
Forrester will discuss their research into real-time data pipelines and analytics, and Cloudera will
discuss how to make it a reality.
Part 4: Technical Deep-Dive into Apache Kudu
An in-depth examination of the technical architecture and design of Apache Kudu, straight from a PMC
Updateable Analytic Storage
Simple real-time analytics and updates with Apache Kudu
Kudu: Storage for fast analytics on fast data
• Simplified architecture for building real-time analytic
• Designed for next-generation hardware for faster analytic
performance across frameworks
• Native Hadoop storage engine
Flexibility for the right tools for the right use
case in one platform
• Only analytic database for big data with Kudu + Impala
• Simple real-time applications with Kudu + Spark
Use cases
• Time series data
• Machine data analytics
• Online reporting
Kafka, Flume
Sentry, RecordService
Spark, Hive, Pig
Object Store
Ingest data of any
type or volume
Process data as it
Serve data to users
and applications
Real-Time Data
Drivers for agile, real-time data platforms
The key use cases that are driving businesses towards real time
Data on adoption trends for real-time technologies
What is Forrester seeing in the market for real-time technologies?
Deploying a real-time OSS architecture to grow your business
How can you build a scalable, cost-effective platform to grow your
Michele Goetz
Special Guest Speaker
Principal Analyst Serving Enterprise Architecture Professionals
Superior CX depends on data and insights
Fraud and risk management requires real-time data
IoT heat map shows where data matters most, now
Data bottlenecks are catalysts for transition
Create a road map for a real-time, agile data platform
Leaders are focused on the technologies that allow data and
insights to be consumed across the organization
What are your firm's plans for the following data driven initiatives?
Base: 3005 global data and analytics decision-makers.
Source: Business Technographics® Global Data & Analytics Survey, 2016
Creating an organizational center of excellence for business intelligence
Combine content management and data management programs into a unified information management
Changing our processes to promote data stewardship and sharing
Investing in platforms to and share out data content
Creating a business led data stewardship or governance program
Changing management incentives to promote data sharing
Implementing analytics insights in software systems to aid customers or support employee decisions.
Investing more in business friendly, self-service visualization and analytics
Engaging external services providers or strategic business consultants for data and analytics or insights
Providing data preparation tools for self-service data management
Investing in distributed real time insight delivery technology
Expanding/Implemented Planning to implement within the next 12 months
Base: 325 global data and analytics technology decision-makers. “Don’t know” not shown.
Source: Business Technographics® Global Data & Analytics Survey, 2016
Which of the following describes your [TDM=”IT budget data and analytics technology or services”; BDM=”business budget for data and analytics technology or services”] from 2015 to 2016?
[Bar chart, horizontal axis 0% to 35%. Categories: Decrease by 5% to 10%; Don’t know; Decrease by 1-4%; Increase by more than 10%; Increase by 5% to 10%; Increase by 1-4%; Stay about the same]
54% of data and analytics technology decision-makers increased
their budgets for data and analytics from 2015 to 2016
Companies of all sizes are spending millions for data & analytics
Note: Don’t know excluded. Base: 765*, 1,288 global data and analytics decision makers
Source: Business Technographics® Global Data & Analytics Survey, 2016
Please estimate, in millions, how much your data and analytics budget is for 2016? (Note: Number is in US Dollars)
[Bar chart comparing SMB (20-999 employees)* and Enterprise (1,000 or more employees) across budget bands: Less than $1 million; $1 million to under $10 million; $10 million to under $100 million; $100 million to under $500 million; $500 million to under $1 billion; $1 billion to under $5 billion; $5 billion or more]
Among the DM technologies Forrester tracks, interest in stream
processing tools has grown the most year over year
What are your firm's plans to use the following data management technologies?
Base: 2094 and *1805 global data and analytics technology decision-makers.
Source: Business Technographics® Global Data & Analytics Survey, 2016
[Chart: for each technology, % with commitment (expanding, implemented, or planning to implement in the next 12 months) and % with interest, but no immediate plans. Technologies: Stream processing tools; Inverted index database; Distributed NoSQL; Hadoop; Associative index; RDF, triple store. Year-over-year change labels: +5 p.p., +3 p.p., -2 p.p., -1 p.p., -2 p.p., -3 p.p.]
Base: Total: 2094
Source: Business Technographics® Global Data & Analytics Survey, 2016
Which of the following are included in your plans for big data?
NoSQL other than Hadoop
A MPP (massively parallel processing) data warehouse
Semantic technologies (ontology building, search, auto curation, graph, etc.)
Hadoop (including Hbase or Accumulo)
Data anonymization or de-identification
Creating or building out a data lake
Marketing or digital data management platforms and service providers that
brand their offerings as big data
Packaged analytics technologies that brand themselves as big data
Unstructured data mining / analytics
Distributed in memory databases, grids, analytics tools
Streaming analytics / computing
Large scale predictive modeling, data mining or other advanced analytics
Public cloud big data services
Streaming analytics high in the list of big data plans
Trend Towards Real-Time Data Platforms is Clear
Drivers for Real-Time Platforms
• Enhancing customer experiences
• Risk Management
• Advancement of IoT and broader instrumentation
Adoption is Accelerating
• Top data-driven initiative by investment: distributed delivery of
real-time data
• DM technology with highest momentum: stream processing
• Top big data plans: streaming analytics ranks in the top 3
• Broad, large investments: 90% of decision makers are either
continuing or increasing their investments in data and analytics;
millions/billions being spent
The Underlying Driver
What drives a use case to real-time?
High Frequency Trading
APT Detection
Fraud Detection
Predictive Maintenance
Next Best Offer
Inventory Management
Shipping/Logistic Systems
CRM Systems
Employee Management
Strategic Planning
Real-time data management use cases are
defined by a common set of characteristics.
• Narrow time window in which to make a decision
(automated or manual)
• Opportunity for the data points to change the
decision path
• Decreasing value of data over time
Not all use cases have a pressing need for
real-time data.
• Broader strategic decisions, for example, do not
require real-time data input
• Over time, decreases in HW costs and increases in
availability of real-time systems will lead most use
cases to be conducted in real-time
Real Time
Some Latency
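The characteristics above (a narrow decision window and data whose value decreases over time) can be sketched numerically. This is an illustration only: the exponential half-life model and the names remaining_value and misses_window are assumptions made for the example, not something the slides specify.

```python
def remaining_value(initial_value: float, latency_s: float, half_life_s: float) -> float:
    """Value left in a data point after latency_s seconds.

    Assumes (hypothetically) that value halves every half_life_s seconds.
    """
    return initial_value * 0.5 ** (latency_s / half_life_s)


def misses_window(decision_window_s: float, pipeline_latency_s: float) -> bool:
    """True when the pipeline delivers data after the decision window has closed."""
    return pipeline_latency_s > decision_window_s


if __name__ == "__main__":
    # Fraud detection style case: a 2 second window, data value halving every second.
    for latency in (0.5, 2.0, 60.0):
        print(latency, remaining_value(100.0, latency, 1.0), misses_window(2.0, latency))
```

Under this model, a strategic-planning use case would have a half-life of days or weeks, so batch latency costs little; a fraud check has a half-life of seconds, so the same latency erases nearly all of the value.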
Moving to Real-Time and Leveraging Analytics
What do we have to gain?
“Monitoring System”
Sensors are automatically
monitored and
programmed to deliver
warnings when readings
are delivered outside of
an “optimal zone”.
Basic models developed
over small subsets of
“Predictive System”
Ingestion and processing
of all sensor data into an
unlimited data store with
analytic capabilities
enables machine
learning, which can
provide automated
optimization and
predictive maintenance.
“Only 1 percent of data from an oil rig with 30,000 sensors
is examined. The data that are used today are mostly for
anomaly detection and control, not optimization and
prediction, which provide the greatest value.”
- McKinsey & Company
Traditional Architectures Real-Time Analytic Capabilities
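The difference between the two systems can be sketched in a few lines. The monitoring function only flags readings that have already left the optimal zone; the predictive function fits a straight line to recent readings and estimates how many steps remain before the zone is breached. The function names, the zone bounds, and the use of a simple least-squares trend are assumptions for illustration; a real predictive-maintenance model would be considerably richer.

```python
from typing import List, Optional


def monitoring_alerts(readings: List[float], low: float, high: float) -> List[int]:
    """Indices of readings that are already outside the optimal zone [low, high]."""
    return [i for i, r in enumerate(readings) if r < low or r > high]


def steps_until_breach(readings: List[float], low: float, high: float) -> Optional[float]:
    """Extrapolate a least-squares line through the readings (x = 0..n-1).

    Returns the number of steps after the last reading at which the fitted
    line leaves the zone, or None if the trend is flat or there is too
    little data. A return of 0.0 means the fitted line has already left it.
    """
    n = len(readings)
    if n < 2:
        return None
    x_mean = (n - 1) / 2
    y_mean = sum(readings) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(readings))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    if slope == 0:
        return None
    intercept = y_mean - slope * x_mean
    bound = high if slope > 0 else low
    steps = (bound - intercept) / slope - (n - 1)
    return max(steps, 0.0)


if __name__ == "__main__":
    temps = [70, 72, 74, 76, 78]
    print(monitoring_alerts(temps, 60, 90))   # nothing is out of the zone yet
    print(steps_until_breach(temps, 60, 90))  # but the trend points to a breach
```

In the sample data the monitoring check stays silent while the trend estimate gives advance warning, which is the kind of gain the slide attributes to analysing all sensor data rather than a small subset.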
Ingestion at Cloudera
• Apache Sqoop for data from
relational databases
• Apache Flume for logs, event
based data
• Apache Kafka is fast,
scalable, and fault-tolerant
Partners, such as Streamsets,
provide rich visualization tools
Ingestion in Real-Time
Stream Ingestion is a Must for Many Use Cases
Ingestion isn’t just about internal business data anymore.
• Traditional ingestion was internally focused, and often a matter of
moving data from one silo or system to another
• Today, businesses aim to take in data from a variety of external
sources, IoT sensors, and machine-generated (user/network)
Your data journey can’t start until the data arrives.
• Each step of the ingest/process/serve data pipeline must occur
at real-time speed if decisions are to be made in time to affect
the course of business
Visualization helps practitioners understand their data.
• Complex tasks can be made less complex via graphical
representations; data ingestion is no different
Stream Processing at Cloudera
Spark Streaming, the leading
open-source framework for real-
time use cases, is deployed in
Cloudera’s real-time
Cloudera has the broadest base
of Hadoop-adjacent experience
with Spark and integrating it
with Apache components.
Processing in Real-Time
Unlocking Value at Speed
For some use cases, batch just isn’t enough.
• Batch processing can lead to bottlenecks and delays in data
transformations that cause missed opportunities.
Apache Spark is gaining momentum for a reason.
• Leveraging Apache Spark for stream processing enables real-time
use cases with sub-second latency and best-in-class APIs.
Spark has a best-in-class ecosystem.
• Machine learning (via MLlib) is seamlessly integrated into Spark.
• Broadest set of vendors and contributors working on Spark
among available processing engines, leading to rapid innovation.
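To make the batch versus stream contrast concrete, here is a minimal plain-Python sketch of a tumbling-window count that emits each result as soon as its window closes, instead of waiting for the whole dataset. It does not use Spark or any Spark API; the function name, the event shape (timestamp, payload), and the assumption that events arrive in timestamp order are all choices made for this illustration.

```python
from typing import Any, Iterable, Iterator, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, Any]], window_s: float
) -> Iterator[Tuple[float, int]]:
    """Yield (window_start, event_count) as each fixed-size window closes.

    Assumes events arrive ordered by timestamp. Windows that contain no
    events are skipped rather than reported with a zero count.
    """
    current_start = None
    count = 0
    for timestamp, _payload in events:
        start = (timestamp // window_s) * window_s
        if current_start is None:
            current_start = start
        if start != current_start:
            # The previous window is complete: emit it without waiting for more data.
            yield current_start, count
            current_start, count = start, 0
        count += 1
    if current_start is not None:
        yield current_start, count


if __name__ == "__main__":
    clicks = [(0.1, "a"), (0.4, "b"), (1.2, "c"), (3.0, "d")]
    for window_start, n in tumbling_window_counts(clicks, 1.0):
        print(window_start, n)
```

Because the function is a generator, a consumer can act on the first window while later events are still arriving, which is the property that avoids the batch bottleneck described above.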
Data Serving at Cloudera
Apache Kudu provides batch
analysis and real-time serving within
the same storage layer
Apache HBase yields the best
read/write performance
Cloudera Search enables SQL-like
faceted search in natural language
Apache Kafka can be used to serve
data to applications and users
Serving in Real-Time
Inject Data into Real-Time Decisions
You need options that suit your use case.
• Platform proliferation hurts IT departments as skillsets are
divided; fewer platforms with broad capabilities help.
Apache Kudu changes the game for open source
• Before Kudu, combining real-time serving with analytic scans through a
relational database required a complex lambda architecture
• Together, simplification and affordability should drive more use
cases to real-time automated processes, in turn driving
increased revenue, decreased risk, and better service for
companies deploying Kudu
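The simplification claimed here can be illustrated with two toy in-memory classes. The first mimics a lambda-style design, where a batch view and a speed layer must be kept in sync and merged on every query; the second mimics a single updateable table that serves point reads and scans from one place. Neither class uses or resembles the actual Kudu API; the class and method names are hypothetical and exist only for this sketch.

```python
from typing import Dict, Optional


class LambdaStyleStore:
    """Toy lambda architecture: a batch view plus a speed layer, merged at query time."""

    def __init__(self) -> None:
        self.batch_view: Dict[str, int] = {}   # rebuilt by periodic batch jobs
        self.speed_layer: Dict[str, int] = {}  # recent writes not yet in the batch view

    def write(self, key: str, value: int) -> None:
        self.speed_layer[key] = value

    def run_batch(self) -> None:
        """Fold recent writes into the batch view and reset the speed layer."""
        self.batch_view.update(self.speed_layer)
        self.speed_layer.clear()

    def read(self, key: str) -> Optional[int]:
        # Every read has to check both layers, newest first.
        if key in self.speed_layer:
            return self.speed_layer[key]
        return self.batch_view.get(key)

    def scan_total(self) -> int:
        # Every scan has to merge both layers before aggregating.
        merged = {**self.batch_view, **self.speed_layer}
        return sum(merged.values())


class UpdatableTable:
    """Toy single storage layer: upserts, point reads, and scans on the same rows."""

    def __init__(self) -> None:
        self.rows: Dict[str, int] = {}

    def upsert(self, key: str, value: int) -> None:
        self.rows[key] = value

    def read(self, key: str) -> Optional[int]:
        return self.rows.get(key)

    def scan_total(self) -> int:
        return sum(self.rows.values())


if __name__ == "__main__":
    lam, table = LambdaStyleStore(), UpdatableTable()
    lam.write("a", 1)
    lam.run_batch()
    lam.write("a", 5)
    lam.write("b", 2)
    for key, value in (("a", 1), ("a", 5), ("b", 2)):
        table.upsert(key, value)
    print(lam.scan_total(), table.scan_total())
```

Both stores return the same answers, but the lambda-style version carries two code paths and a merge step that the single table does not need.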
Fast Scans, Analytics
and Processing of
Stored Data
Fast On-Line
Updates &
Data Serving
Arbitrary Storage
(Active Archive)
Fast Analytics
(on fast-changing or
frequently-updated data)
Apache Kudu: Filling the Analytic Gap
Fast Changing
Frequent Updates
Kudu fills the Gap
Modern analytic
applications often
require complex data
flow & difficult
integration work to
move data between
HBase & HDFS
Pace of Analysis
Real-Time Data Analysis at Work
Customer 360  “Next Best Offer 2.0”
Spark MLlib
Individual Session
Full Model/Learning
Data Request Sent For Stream Processing
Data Cleaned/Ordered/Processed, Then
Delivered to Kudu for Modelling
User’s navigation returns the results they
are looking for, in addition to offers and
suggestions hyper-customized for them.
models will
likely have
Machine Learning
Kudu opens the door to machine learning
Kudu provides the ability
to leverage real-time
updates and analytic
scans together - critical for
many machine learning
Source: GHOSTS IN THE MACHINE: Artificial intelligence, risks and regulation in financial markets
The Time for
Real-Time Data
and Analytics
is Now.
And the platform for it
is Cloudera Enterprise.

More Related Content

What's hot

Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Cloudera, Inc.
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
Cloudera, Inc.
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Cloudera, Inc.
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

Cloudera, Inc.
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
Cloudera, Inc.
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18
Cloudera, Inc.
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
Cloudera, Inc.
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Cloudera, Inc.
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
Cloudera, Inc.
Live Cloudera Cybersecurity Solution Demo
Live Cloudera Cybersecurity Solution DemoLive Cloudera Cybersecurity Solution Demo
Live Cloudera Cybersecurity Solution Demo
Cloudera, Inc.
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning

Cloudera, Inc.
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

Cloudera, Inc.
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
Cloudera, Inc.
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Cloudera, Inc.
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
Cloudera, Inc.
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
Cloudera, Inc.
Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
Cloudera, Inc.
Advanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine LearningAdvanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine Learning
Cloudera, Inc.
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things
Cloudera, Inc.

What's hot (20)

Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
Live Cloudera Cybersecurity Solution Demo
Live Cloudera Cybersecurity Solution DemoLive Cloudera Cybersecurity Solution Demo
Live Cloudera Cybersecurity Solution Demo
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning
Transforming Insurance Analytics with Big Data and Automated Machine Learning

Transforming Insurance Analytics with Big Data and Automated Machine Learning

Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
Part 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache KuduPart 1: Lambda Architectures: Simplified by Apache Kudu
Part 1: Lambda Architectures: Simplified by Apache Kudu
Advanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine LearningAdvanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine Learning
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things

Viewers also liked

Enabling the Connected Car Revolution

Enabling the Connected Car Revolution
Enabling the Connected Car Revolution

Enabling the Connected Car Revolution

Cloudera, Inc.
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1

Cloudera, Inc.
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Cloudera, Inc.
Top 5 IoT Use Cases
Top 5 IoT Use CasesTop 5 IoT Use Cases
Top 5 IoT Use Cases
Cloudera, Inc.
Securing the Data Hub--Protecting your Customer IP (Technical Workshop)
Securing the Data Hub--Protecting your Customer IP (Technical Workshop)Securing the Data Hub--Protecting your Customer IP (Technical Workshop)
Securing the Data Hub--Protecting your Customer IP (Technical Workshop)
Cloudera, Inc.
Apache Beam
Apache Beam Apache Beam
Apache Beam
Adil Oulghard
Derechos de libertad
Derechos de libertadDerechos de libertad
Derechos de libertad
livinstong zerna
Kafka & Couchbase Integration Patterns
Kafka & Couchbase Integration PatternsKafka & Couchbase Integration Patterns
Kafka & Couchbase Integration Patterns
Manuel Hurtado
Vectores en r3
Vectores en r3Vectores en r3
Oe global2017 corti
Oe global2017 cortiOe global2017 corti
Oe global2017 corti
METID - Politecnico of Milan
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
Cloudera, Inc.
Charlotte whiplash presentation
Charlotte whiplash presentationCharlotte whiplash presentation
Charlotte whiplash presentation
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Fuentes singulares
Fuentes singularesFuentes singulares
Fuentes singulares
Erwin Jose Rincón Caballero
Identificacion de peligros y evaluacion de riesgos en oficinas taller de nve...
Identificacion de peligros y evaluacion de riesgos en oficinas  taller de nve...Identificacion de peligros y evaluacion de riesgos en oficinas  taller de nve...
Identificacion de peligros y evaluacion de riesgos en oficinas taller de nve...
Alex Cumbicus Saavedra
Christopher c. greene 2017
Christopher c. greene 2017Christopher c. greene 2017
Christopher c. greene 2017
Christopher Greene, BSW MS.MBA
The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)
Cloudera, Inc.
Real-time Data Processing using AWS Lambda
Real-time Data Processing using AWS LambdaReal-time Data Processing using AWS Lambda
Real-time Data Processing using AWS Lambda
Amazon Web Services
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the Log
Ben Stopford

Viewers also liked (20)

Enabling the Connected Car Revolution

Enabling the Connected Car Revolution
Enabling the Connected Car Revolution

Enabling the Connected Car Revolution

Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1

Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Top 5 IoT Use Cases
Top 5 IoT Use CasesTop 5 IoT Use Cases
Top 5 IoT Use Cases
Securing the Data Hub--Protecting your Customer IP (Technical Workshop)
Securing the Data Hub--Protecting your Customer IP (Technical Workshop)Securing the Data Hub--Protecting your Customer IP (Technical Workshop)
Securing the Data Hub--Protecting your Customer IP (Technical Workshop)
Apache Beam
Apache Beam Apache Beam
Apache Beam
Derechos de libertad
Derechos de libertadDerechos de libertad
Derechos de libertad
Kafka & Couchbase Integration Patterns
Kafka & Couchbase Integration PatternsKafka & Couchbase Integration Patterns
Kafka & Couchbase Integration Patterns
Vectores en r3
Vectores en r3Vectores en r3
Vectores en r3
Oe global2017 corti
Oe global2017 cortiOe global2017 corti
Oe global2017 corti
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
Charlotte whiplash presentation
Charlotte whiplash presentationCharlotte whiplash presentation
Charlotte whiplash presentation
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassan...
Fuentes singulares
Fuentes singularesFuentes singulares
Fuentes singulares
Identificacion de peligros y evaluacion de riesgos en oficinas taller de nve...
Identificacion de peligros y evaluacion de riesgos en oficinas  taller de nve...Identificacion de peligros y evaluacion de riesgos en oficinas  taller de nve...
Identificacion de peligros y evaluacion de riesgos en oficinas taller de nve...
Christopher c. greene 2017
Christopher c. greene 2017Christopher c. greene 2017
Christopher c. greene 2017
The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)The Vortex of Change - Digital Transformation (Presented by Intel)
The Vortex of Change - Digital Transformation (Presented by Intel)
Real-time Data Processing using AWS Lambda
Real-time Data Processing using AWS LambdaReal-time Data Processing using AWS Lambda
Real-time Data Processing using AWS Lambda
The Power of the Log
The Power of the LogThe Power of the Log
The Power of the Log

Kudu Forrester Webinar

  • 1. © Cloudera, Inc. All rights reserved. Apache Kudu Webinar Series: Understanding and Unlocking the Value of Real-Time Data. Ryan Lippert | Cloudera; Michele Goetz | Forrester (Special Guest)
  • 2. Kudu Webinar Series
      Part 1: Lambda Architectures – Simplified by Apache Kudu. A look into the potential trouble involved with a lambda architecture, and how Apache Kudu can dramatically simplify real-time analytics.
      Part 2: Extending the Capabilities of Operational and Analytical Databases. An examination of how Apache Kudu expands the set of use cases that Cloudera’s Operational and Analytical databases can handle.
      Part 3: Data-in-Motion: Unlock the Value of Real-Time Data. Forrester will discuss their research into real-time data pipelines and analytics, and Cloudera will discuss how to make it a reality.
      Part 4: Technical Deep-Dive into Apache Kudu. An in-depth examination of the technical architecture and design of Apache Kudu, straight from a PMC Member.
  • 3. Updateable Analytic Storage: Simple real-time analytics and updates with Apache Kudu
      Kudu: Storage for fast analytics on fast data
        • Simplified architecture for building real-time analytic applications
        • Designed for next-generation hardware for faster analytic performance across frameworks
        • Native Hadoop storage engine
      Flexibility for the right tools for the right use case in one platform
        • Only analytic database for big data with Kudu + Impala
        • Simple real-time applications with Kudu + Spark
      Use cases
        • Time series data
        • Machine data analytics
        • Online reporting
      [Platform diagram] INTEGRATE: STRUCTURED (Sqoop), UNSTRUCTURED (Kafka, Flume). PROCESS, ANALYZE, SERVE: BATCH (Spark, Hive, Pig, MapReduce), STREAM (Spark), SQL (Impala), SEARCH (Solr), OTHER (Kite). UNIFIED SERVICES: RESOURCE MANAGEMENT (YARN), SECURITY (Sentry, RecordService). STORE: FILESYSTEM (HDFS), RELATIONAL (Kudu), NoSQL (HBase), OTHER (Object Store).
  • 4. Real-Time Data: ingest data of any type or volume; process data as it arrives; serve data to users and applications.
  • 5. Agenda
      Drivers for agile, real-time data platforms: What are the key use cases driving businesses towards real-time platforms?
      Data on adoption trends for real-time technologies: What is Forrester seeing in the market for real-time technologies?
      Deploying a real-time OSS architecture to grow your business: How can you build a scalable, cost-effective platform to grow your business?
  • 6. © 2017 FORRESTER. REPRODUCTION PROHIBITED. Michele Goetz Special Guest Speaker Principal Analyst Serving Enterprise Architecture Professionals
  • 7. Agenda
      Drivers for agile, real-time data platforms: What are the key use cases driving businesses towards real-time platforms?
      Data on adoption trends for real-time technologies: What is Forrester seeing in the market for real-time technologies?
      Deploying a real-time OSS architecture to grow your business: How can you build a scalable, cost-effective platform to grow your business?
  • 8. Superior CX depends on data and insights
  • 9. Fraud and risk management requires real-time data
  • 10. IoT heat map shows where data matters most, now
  • 11. Data bottlenecks are catalysts for transition
  • 12. Create a road map for a real-time, agile data platform
  • 13. Agenda
      Drivers for agile, real-time data platforms: What are the key use cases driving businesses towards real-time platforms?
      Data on adoption trends for real-time technologies: What is Forrester seeing in the market for real-time technologies?
      Deploying a real-time OSS architecture to grow your business: How can you build a scalable, cost-effective platform to grow your business?
  • 14. Leaders are focused on the technologies that allow data and insights to be consumed across the organization
      Question: What are your firm's plans for the following data driven initiatives?
      Base: 3005 global data and analytics decision-makers. Source: Business Technographics® Global Data & Analytics Survey, 2016
      Values per initiative, in charted order: Expanding/Implemented / Planning to implement within the next 12 months
        • Creating an organizational center of excellence for business intelligence: 51% / 22%
        • Combine content management and data management programs into a unified information management program: 51% / 22%
        • Changing our processes to promote data stewardship and sharing: 51% / 22%
        • Investing in platforms to and share out data content: 51% / 22%
        • Creating a business led data stewardship or governance program: 51% / 22%
        • Changing management incentives to promote data sharing: 49% / 24%
        • Implementing analytics insights in software systems to aid customers or support employee decisions: 52% / 22%
        • Investing more in business friendly, self-service visualization and analytics: 52% / 23%
        • Engaging external services providers or strategic business consultants for data and analytics or insights services: 54% / 22%
        • Providing data preparation tools for self-service data management: 54% / 23%
        • Investing in distributed real time insight delivery technology: 58% / 22%
  • 15. 54% of data and analytics technology decision-makers increased their budgets for data and analytics from 2015 to 2016
      Question: Which of the following describes your [TDM=“IT budget data and analytics technology or services”; BDM=“business budget for data and analytics technology or services”] from 2015 to 2016?
      Base: 325 global data and analytics technology decision-makers. “Don’t know” not shown. Source: Business Technographics® Global Data & Analytics Survey, 2016
        • Decrease by 5% to 10%: 4%
        • Don’t know: 5%
        • Decrease by 1-4%: 6%
        • Increase by more than 10%: 6%
        • Increase by 5% to 10%: 22%
        • Increase by 1-4%: 26%
        • Stay about the same: 30%
  • 16. Companies of all sizes are spending millions for data & analytics. Survey question: Please estimate, in millions, how much your data and analytics budget is for 2016? (Note: Number is in US Dollars.) Responses (SMB, 20-999 employees* / Enterprise, 1,000 or more employees): Less than $1 million (55% / 32%); $1 million to under $10 million (22% / 30%); $10 million to under $100 million (9% / 13%); $100 million to under $500 million (1% / 4%); $500 million to under $1 billion (1% / 2%); $1 billion to under $5 billion (0% / 2%); $5 billion or more (0% / 1%). Note: “Don’t know” excluded. Base: 765*, 1,288 global data and analytics decision makers. Source: Business Technographics® Global Data & Analytics Survey, 2016.
  • 17. Among the DM technologies Forrester tracks, interest for stream processing tools has grown the most YoY. Survey question: What are your firm's plans to use the following data management technologies? % with commitment (expanding, implemented, or planning to implement in the next 12 months), with year-over-year change: Stream processing tools, 59% to 64% (+5 p.p.); Inverted index database, 61% to 64% (+3 p.p.); Distributed NoSQL databases, 63% to 61% (-2 p.p.); Hadoop, 63% to 62% (-1 p.p.); Associative index databases, 60% to 58% (-2 p.p.); RDF, triple store, 59% to 56% (-3 p.p.). % with interest, but no immediate plans, in the same technology order: 20%, 19%, 19%, 20%, 19%, 19% and 13%, 13%, 16%, 14%, 14%, 13%. Base: 2094 and *1805 global data and analytics technology decision-makers. Source: Business Technographics® Global Data & Analytics Survey, 2016.
  • 18. Streaming analytics high in the list of big data plans. Survey question: Which of the following are included in your plans for big data? Responses: NoSQL other than Hadoop (16%); A MPP (massively parallel processing) data warehouse (18%); Semantic technologies (ontology building, search, auto curation, graph, etc.) (22%); Hadoop (including Hbase or Accumulo) (23%); Data anonymization or de-identification (23%); Creating or building out a data lake (26%); Marketing or digital data management platforms and service providers that brand their offerings as big data (26%); Packaged analytics technologies that brand themselves as big data (27%); Unstructured data mining / analytics (28%); Distributed in memory databases, grids, analytics tools (30%); Streaming analytics / computing (33%); Large scale predictive modeling, data mining or other advanced analytics (36%); Public cloud big data services (40%). Base: Total: 2094. Source: Business Technographics® Global Data & Analytics Survey, 2016.
  • 19. Agenda. Drivers for agile, real-time data platforms: What are the key use cases driving businesses towards real-time platforms? Data on adoption trends for real-time technologies: What is Forrester seeing in the market for real-time technologies? Deploying a real-time OSS architecture to grow your business: How can you build a scalable, cost-effective platform to grow your business?
  • 20. Trend Towards Real-Time Data Platforms is Clear. Drivers for Real-Time Platforms: • Enhancing customer experiences • Risk management • Advancement of IoT and broader instrumentation. Adoption is Accelerating: • Top data-driven initiative by investment: distributed delivery of real-time data • DM technology with highest momentum: stream processing • Top big data plans: streaming analytics is in the top 3 • Broad, large investments: 90% of decision makers are either continuing or increasing their investments in data and analytics; millions/billions being spent.
  • 21. The Underlying Driver: What drives a use case to real-time? Use cases shown on a spectrum from “Real Time” to “Some Latency Acceptable”: High Frequency Trading; APT Detection; Fraud Detection; Predictive Maintenance; Next Best Offer; Inventory Management; Shipping/Logistic Systems; CRM Systems; Employee Management; Strategic Planning. Real-time data management use cases are defined by a common set of characteristics: • Narrow time window in which to make a decision (automated or manual) • Opportunity for the data points to change the decision path • Decreasing value of data over time. Not all use cases have a pressing need for real-time data: • Broader strategic decisions, for example, do not require real-time data input • Over time, decreases in HW costs and increases in availability of real-time systems will lead most use cases to be conducted in real-time.
  • 22. Moving to Real-Time and Leveraging Analytics: What do we have to gain? Traditional Architectures, “Monitoring System”: Sensors are automatically monitored and programmed to deliver warnings when readings fall outside of an “optimal zone”. Basic models are developed over small subsets of data. Real-Time Analytic Capabilities, “Predictive System”: Ingestion and processing of all sensor data into an unlimited data store with analytic capabilities enables machine learning, which can provide automated optimization and predictive maintenance. “Only 1 percent of data from an oil rig with 30,000 sensors is examined. The data that are used today are mostly for anomaly detection and control, not optimization and prediction, which provide the greatest value.” - McKinsey & Company
  • 23. Real-Time Data: Ingest data of any type or volume; process data as it arrives; serve data to users and applications.
  • 24. Ingestion in Real-Time: Stream Ingestion is a Must for Many Use Cases. Ingestion isn’t just about internal business data anymore: • Traditional ingestion was internally focused, and often a matter of moving data from one silo or system to another • Today, businesses aim to take in data from a variety of external sources, IoT sensors, and machine-generated (user/network) data. Your data journey can’t start until the data arrives: • Each step of the ingest/process/serve data pipeline must occur at real-time speed if decisions are to be made in time to affect the course of business. Visualization helps practitioners understand their data: • Complex tasks can be made less complex via graphical representations; data ingestion is no different. Ingestion at Cloudera: • Apache Sqoop for data from relational databases • Apache Flume for logs and event-based data • Apache Kafka for fast, scalable, and fault-tolerant messaging. Partners, such as Streamsets, provide rich visualization tools.
  • 25. Ingestion in Real-Time: Unlocking Value at Speed. For some use cases, batch just isn’t enough: • Batch processing can lead to bottlenecks and delays in data transformations that cause missed opportunities. Apache Spark is gaining momentum for a reason: • Leveraging Apache Spark for stream processing enables real-time use cases with sub-second latency and best-in-class APIs. Spark has a best-in-class ecosystem: • Machine learning (via MLlib) is seamlessly integrated into Spark • The broadest set of vendors and contributors among available processing engines works on Spark, leading to rapid innovation. Stream Processing at Cloudera: Spark Streaming, the leading open-source framework for real-time use cases, is deployed in Cloudera’s real-time architectures. Cloudera has the broadest base of Hadoop-adjacent experience with Spark and integrating it with Apache components.
  • 26. Serving in Real-Time: Inject Data into Real-Time Decisions. You need options that suit your use case: • Platform proliferation hurts IT departments as skillsets are divided; fewer platforms with broad capabilities help. Apache Kudu changes the game for open source software: • Before Kudu, combining real-time serving with analytic scans through a relational database required a complex lambda architecture • Together, simplification and affordability should drive more use cases to real-time automated processes, in turn driving increased revenue, decreased risk, and better service for companies deploying Kudu. Data Serving at Cloudera: • Apache Kudu provides batch analysis and real-time serving within the same storage layer • Apache HBase yields the best read/write performance • Cloudera Search enables SQL-like faceted search in natural language • Apache Kafka can be used to serve data to applications and users.
  • 27. Apache Kudu: Filling the Analytic Gap. [Diagram: storage engines plotted by Pace of Analysis against Pace of Data, with axis labels Unchanging, Fast Changing, Frequent Updates, Append-Only, and Real-Time.] HDFS: Fast Scans, Analytics and Processing of Stored Data. HBase: Fast On-Line Updates & Data Serving. Arbitrary Storage (Active Archive). Kudu fills the gap: Fast Analytics (on fast-changing or frequently-updated data). Analytic Gap: Modern analytic applications often require complex data flow & difficult integration work to move data between HBase & HDFS.
  • 28. Real-Time Data Analysis at Work: Customer 360 → “Next Best Offer 2.0”. [Diagram labels: Data Sources; Kafka; Spark Streaming; Kudu; Spark MLlib; Application; Individual Session Customer Interaction; Spark Full Model/Learning.] Data Request Sent For Stream Processing. Data Cleaned/Ordered/Processed, Then Delivered to Kudu for Modelling. User’s navigation returns the results they are looking for, in addition to offers and suggestions hyper-customized for them. Illustrative; models will likely have >2 dimensions.
  • 29. Machine Learning: Kudu opens the door to machine learning. Kudu provides the ability to leverage real-time updates and analytic scans together, which is critical for many machine learning applications. Source: GHOSTS IN THE MACHINE: Artificial intelligence, risks and regulation in financial markets
  • 30. The Time for Real-Time Data and Analytics is Now. And the platform for it is Cloudera Enterprise.
  • 31. 31© Cloudera, Inc. All rights reserved.
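
Slide 22 describes a “monitoring system” as one that delivers a warning when a sensor reading falls outside an “optimal zone”. A minimal sketch of that rule in plain Python follows. The `Reading` type, the `out_of_zone` function, the sensor names, and the zone bounds are all hypothetical, chosen only for illustration; none of them come from the deck or from any Cloudera API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Reading:
    """One sensor measurement (hypothetical structure for illustration)."""
    sensor_id: str
    value: float


def out_of_zone(readings: Iterable[Reading], low: float, high: float) -> List[Reading]:
    """Return the readings whose value lies outside the closed range [low, high]."""
    return [r for r in readings if r.value < low or r.value > high]


# Made-up sample data: one reading inside the zone, two outside it.
readings = [
    Reading("pump-1", 71.5),
    Reading("pump-2", 96.0),  # above the zone
    Reading("pump-3", 48.2),  # below the zone
]
warnings = out_of_zone(readings, low=50.0, high=90.0)
for r in warnings:
    print(f"WARNING {r.sensor_id}: {r.value} is outside [50.0, 90.0]")
```

A rule like this only flags anomalies after the fact. The “predictive system” on the same slide differs in that it retains all sensor data so that models can be trained on it.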

Editor's Notes

  1. Ingest: Collecting the Data. Today’s data-in-motion conversation, like the data journey itself, starts with ingestion. The increase in sensor-generated data associated with IoT, combined with the demands for social media data collection, has created a deluge of unstructured data that is difficult for organizations to contend with. Because ingestion is a common initial bottleneck in the data-in-motion journey, organizations often reach for a robust ingestion solution. However, it’s important to understand ingestion as part of a broader real-time data context; it’s a critical component, but only the first of three. Cloudera takes an open-source approach to ingestion, as it does with all three stages of the data-in-motion journey. Identifying the need for a streaming data capture system, Cloudera led the development of Apache Flume, the open standard for collecting and moving vast amounts of log data. The subsequent integration of Flume with Apache Kafka created an ingest architecture that has been replicated across Cloudera’s customer base in a variety of use cases. With Flume and Kafka, Cloudera deploys the leading streaming ingest platform. Flume can provide lightweight agents deployed on edge nodes that number in the hundreds or thousands, each of which can be tiered to enable efficient ingest topologies. The integration between Kafka and Flume is bidirectional, meaning either component can be a producer or consumer of data depending on the specifics of your use case. A rising trend in data ingestion is the use of a rich visual interface that lets users interact with their ingestion architecture easily. While Cloudera delivers all the functionality underneath, we work with best-in-class partners such as Streamsets, Cask, and others to deliver rich visualization. This enables Cloudera to focus on our core competency of data management, while enabling vendors with large engineering teams dedicated to visualization to focus on theirs.
Portability, neutrality, and the history of success of companies like Informatica, Talend, and others in similar spaces create the best experience for our customers.
  2. Cloudera relies on Spark Streaming to process data once it is ingested. As the leading open-source processing framework for real-time use cases, Spark Streaming is an open standard and one of the most easily recognizable components of the broader Apache Hadoop™ ecosystem. Cloudera has the broadest base of Hadoop-adjacent experience with Apache Spark™ and Spark Streaming; this is a product of early adoption and integration of these projects into Cloudera Enterprise. Spark Streaming provides the strongest processing solution for data-in-motion use cases as a result of:
     • Best-in-class performance:
       - High throughput ensures that jobs will not bottleneck at the processing stage
       - Sub-second latency enables real-time capabilities
     • Best-in-class API and features:
       - Easy-to-use SQL-based APIs for authoring streaming jobs help expand the number of use cases and value of data in motion
       - “Exactly once” stream processing semantics help ensure accuracy
       - Sliding window computations enable fast insights into time period data slices
       - Built-in APIs for maintaining and updating in-memory information
     • Best-in-class ecosystem:
       - Largest set of vendors working with and around Spark among available processing engines, enabling access to latest innovations
       - Broadest and deepest machine learning library (MLlib) is seamlessly integrated
     Spark Streaming from Cloudera, in particular, benefits users through the most robust integration into the ingestion and serving phases that bookend the data-in-motion story. This integration ensures a fast, easy, and secure delivery of processed data to the serving stage of data in motion.
  3. Whereas ingestion and processing have a relatively consistent flow irrespective of use case, the serving phase of a data-in-motion solution requires a variety of options in order to deliver the right data, to the right place, at the right time. Without this ability to quickly serve data to decision points, a solution loses its real-time capability and ceases to be a data-in-motion solution. Cloudera has a variety of options that help serve the diverse needs of individual use cases:
     • Apache Kudu™: A new, Cloudera-initiated Apache project, Kudu offers the unique ability to do fast scans on fast data. With an overwhelming number of data-in-motion use cases requiring analysis or visualization of streaming data, Kudu can enable the required batch analysis and real-time serving within the same storage layer.
     • Apache HBase™: HBase offers the best random read/write performance of any component within the Hadoop ecosystem. This capability, combined with high levels of concurrent access, enables online applications and operational needs that require the ability to query the latest data.
     • Cloudera Search: Powered by Apache Solr™, Cloudera Search democratizes data by enabling non-technical users to perform SQL-like, faceted search in natural language. Solr’s native integration into Cloudera Enterprise generates faster and more secure results.
     • Apache Kafka: Kafka’s fast, scalable, and durable design enables hundreds of megabytes of reads and writes per second, from thousands of clients. In addition to playing a role in ingestion, Kafka can be used to serve data to applications and users.
     This “last mile” step in the data-in-motion story is arguably the most critical step, which is why this breadth of options is necessary. Each use case, including the tendencies and workflows of the expected users, requires a different set of data access capabilities.
Cloudera can meet any requirement through these tools, and can do so as the final step in an end-to-end data-in-motion story.
  4. Kudu allows you to have your cake and eat it too
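
Note 2 lists sliding window computations among Spark Streaming's features. The idea can be sketched without Spark: keep only the events whose timestamps fall inside the most recent window, and recompute an aggregate as each event arrives. The code below is plain Python, not the Spark Streaming API. The `SlidingWindowAverage` class, its parameters, and the sample events are assumptions made purely for illustration.

```python
from collections import deque
from typing import Deque, Tuple


class SlidingWindowAverage:
    """Average of the values seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, float]] = deque()  # (timestamp, value)
        self._total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        """Record an event and return the average over the current window.

        Timestamps are assumed to arrive in non-decreasing order.
        """
        self._events.append((timestamp, value))
        self._total += value
        # Evict events at or before the cutoff, so the window is (t - w, t].
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value
        return self._total / len(self._events)


# Made-up events: (timestamp in seconds, value), with a 10-second window.
window = SlidingWindowAverage(window_seconds=10.0)
events = [(0.0, 10.0), (4.0, 20.0), (9.0, 30.0), (12.0, 40.0)]
averages = [window.add(t, v) for t, v in events]
print(averages)
```

When the event at 12 seconds arrives, the event at 0 seconds drops out of the window, so the last average covers only the three most recent values. A distributed engine does the same bookkeeping across partitions and adds fault tolerance, which this sketch does not attempt.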