Large scale near real-time log indexing with Flume and SolrCloud

1. Intro
2. Problem to solve?
3. How does Flume/Solr help?
4. Syslog indexing example
5. HA, DR & scalability

Ops Architect at Cisco CCATG (WebEx)
Ensure operational readiness for complex distributed services
HA, DR, monitoring, config, deployment
Previously eBay, Excite@Home, IBM, VISA
Operations architecture, monitoring, event correlation

© 2012 Cisco and/or its affiliates. All rights reserved.
Cisco WebEx Meetings
• Voice, video, desktop sharing
• Meeting/Event/Support/Training
• Centers
• Integration with TelePresence
Cisco WebEx Social
• Social networking
• Content creation
• Integrated IM
Cisco WebEx Messenger
• IM, presence
• Integrate with voice, video
• XMPP

Participants from over 231 countries, 52% market share
2.2 Billion meeting minutes per month
40.5 Million meeting attendees per month
9.4 million registered hosts worldwide
4 Million mobile downloads

[Map legend: Datacenter / PoP; Leased network link]
Global scale: 13 datacenters & iPoPs around the globe
Dedicated network: dual path 10G circuits between DCs
Multi-tenant: 95k sites
Real-time collaboration: voice, desktop sharing, video, chat

People make mistakes
Hardware fails
Software fails
Even failovers sometimes fail

“If a problem has no solution, it may not be a problem,
but a fact, not to be solved, but to be coped with over time”
— Shimon Peres (“Peres’s Law”)
People/HW/SW failures are facts, not problems
Operations' main goal is to maintain high service availability
• Recovery/repair is how we cope with the above facts
• Improving recovery/repair improves availability
Unavailability ≈ MTTR / MTBF (for MTTR ≪ MTBF)
Cutting MTTR to 1/10th is just as valuable as a 10x increase in MTBF
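
The MTTR/MTBF claim can be checked with a few lines of arithmetic. This is a minimal sketch: the function uses the fuller form MTTR / (MTBF + MTTR), of which the slide's ratio is the usual approximation, and the 1-hour and 1000-hour figures are illustrative assumptions, not numbers from the talk.

```python
def unavailability(mttr_hours: float, mtbf_hours: float) -> float:
    """Fraction of time the service is down: MTTR / (MTBF + MTTR)."""
    return mttr_hours / (mtbf_hours + mttr_hours)


# Illustrative numbers only (assumptions, not from the talk).
baseline = unavailability(mttr_hours=1.0, mtbf_hours=1000.0)
faster_repair = unavailability(mttr_hours=0.1, mtbf_hours=1000.0)    # 1/10th MTTR
fewer_failures = unavailability(mttr_hours=1.0, mtbf_hours=10000.0)  # 10x MTBF

for label, value in [("baseline", baseline),
                     ("1/10th MTTR", faster_repair),
                     ("10x MTBF", fewer_failures)]:
    print(f"{label:12s} unavailability = {value:.6f}")
```

Both improvements reduce unavailability by the same factor, which is the argument for investing in faster recovery.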

Even better: proactive
Good: reactive
Your search – What is the root cause of the outage? – did not match any documents.

[Diagram: data sources, ingestion and storage]
Unstructured/semi-structured data: Log4j, file, Avro, syslog, Thrift, AMQP, HTTP/REST, application state & APIs, ingested with Flume
Structured data: RDBMS, MySQL, ingested with Sqoop
Flume sinks: Solr sink (SolrCloud, Solr index), HDFS sink (HDFS, raw data), other sinks
Hardware: Cisco UCS C240 M3 servers, 12 x 3TB = 36 TB / server

[Diagram: two-tier architecture. Collector tier: Flume agents in DC 1 ... DC N receiving syslog, log4j and file input. Storage tier: Flume, HDFS and SolrCloud in DC 1 and DC 2.]

Flume Collector server
[Diagram: agents (failover & load balancing) -> Avro source -> replicating fan-out flow -> File Channel 1 and File Channel 2 -> Avro sinks to DC1 and DC2 -> Flume Storage tier in DC1 and DC2]
• Failover & load balancing agents
• Replicating fan-out flow: all events replicated to both Channels
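
The replicating fan-out can be pictured with a small simulation. This is a sketch in plain Python, not Flume code: `ReplicatingSelector`, `drain` and the channel names are hypothetical, and the pairing of one channel per destination datacenter is an assumption read off the diagram labels.

```python
from collections import deque


class ReplicatingSelector:
    """Copies every incoming event to all configured channels."""

    def __init__(self, channel_names):
        self.channels = {name: deque() for name in channel_names}

    def put(self, event):
        for queue in self.channels.values():
            queue.append(event)


def drain(queue, sink_up):
    """Deliver buffered events while the downstream DC is reachable."""
    delivered = []
    while sink_up and queue:
        delivered.append(queue.popleft())
    return delivered


# Assumed layout: one channel per destination datacenter.
selector = ReplicatingSelector(["channel_dc1", "channel_dc2"])
for event in ["event-a", "event-b", "event-c"]:
    selector.put(event)

# DC1 is unreachable, DC2 is healthy.
sent_to_dc1 = drain(selector.channels["channel_dc1"], sink_up=False)
sent_to_dc2 = drain(selector.channels["channel_dc2"], sink_up=True)
print(sent_to_dc1, sent_to_dc2, len(selector.channels["channel_dc1"]))
```

Because each channel buffers independently, an outage on one side only grows that side's backlog while the other datacenter keeps receiving events.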

Flume Storage tier server
[Diagram: Flume Collectors (failover & load balancing agents) -> Avro source -> multiplexing fan-out flow -> File Channel 1 and File Channel 2 -> HDFS sink and Solr sink -> HDFS and SolrCloud]
• Multiplexing fan-out flow
• All events to HDFS
• Routing to Solr by Flume event header
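
The routing rule (everything to HDFS, selected events also to Solr) can be sketched as a header lookup with a default. This is a plain Python illustration, not Flume configuration: the header name `log_type`, the value `syslog` and the channel names are hypothetical.

```python
# Hypothetical header value -> channels mapping; the default covers everything else.
MAPPING = {"syslog": ["hdfs_channel", "solr_channel"]}
DEFAULT_CHANNELS = ["hdfs_channel"]


def select_channels(headers, header_name="log_type"):
    """Pick target channels from one event header, falling back to HDFS only."""
    return MAPPING.get(headers.get(header_name), DEFAULT_CHANNELS)


indexed = select_channels({"log_type": "syslog"})
archived_only = select_channels({"log_type": "app_debug"})
no_header = select_channels({})
print(indexed, archived_only, no_header)
```

Keeping HDFS in every branch preserves the raw data even for events that are never indexed.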

Isn’t Big Data “schema on read”?
• Why does Solr require a schema on write?
• Dirty little secret: there’s always a schema
• Performance & functionality vs flexibility
• Optimize operations and storage based on field type; that's how you get sub-second response times
There’s always a schema
• Application code vs. central location

Cloudera Morphlines
• Framework to simplify event transformation
• Compatible with existing grok patterns
• Reusable across multiple index workloads:
Flume & M/R
[Diagram: SolrSink morphline. A Flume event (headers + body) becomes a Record that passes through a chain of commands (readLine, grok, tryRules, addValues, ..., loadSolr) and is loaded into Solr as a document matching schema.xml.]

Convert syslog message ...
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
... into Solr schema fields
Severity=[3]
Facility=[22]
host=[colo01-wxp00-ace01b-connect.webex.com]
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234]
severity_label=[error]
access_token=[54asdf654]
id=[b2f839c3-dece-404f-a535-e0141ad549bf]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]
cisco_code=[%ACE-3-251008]
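
The field extraction above can be approximated in plain Python. This is a sketch using the standard `re` module, not the grok patterns or Morphlines commands from the talk. The two regular expressions are assumptions written for this one sample line. The priority decoding (facility = PRI // 8, severity = PRI % 8) follows the standard syslog convention.

```python
import re

# Standard syslog severity names, indexed by severity number.
SEVERITY_LABELS = ["emergency", "alert", "critical", "error",
                   "warning", "notice", "informational", "debug"]

# Assumed patterns, written for the sample message only.
LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) : (?P<msg>.*)$"
)
CISCO = re.compile(r"^%(?P<product>[A-Z0-9_]+)-(?P<level>\d)-(?P<id>\d+)")


def parse(line):
    """Split one syslog line into fields named as on the slide."""
    match = LINE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    fields = {
        "Facility": pri // 8,
        "Severity": pri % 8,
        "severity_label": SEVERITY_LABELS[pri % 8],
        "host": match.group("host"),
        "syslog_message": match.group("msg"),
    }
    cisco = CISCO.match(fields["syslog_message"])
    if cisco:
        fields["cisco_product"] = cisco.group("product")
        fields["cisco_level"] = cisco.group("level")
        fields["cisco_id"] = cisco.group("id")
        fields["cisco_code"] = cisco.group(0)
    return fields


sample = ("<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com : "
          "%ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234")
fields = parse(sample)
print(fields)
```

Priority 179 decodes to facility 22 and severity 3, matching the Facility and Severity values listed above.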

Convert syslog message
<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port 1234
Step 1: readLine reads in Flume event headers and body
Headers:
timestamp=[1371357409000]
category=[545f5sfsd5sf]
Severity=[3]
Facility=[22]
Body:
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013
04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port
1234]

Step 2: convertTimestamp converts epoch to ISO 8601 format
timestamp=[2013-06-16T04:36:49.000Z]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16 2013
04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111 on port
1234]
Severity=[3]
Facility=[22]

Step 3: addValues creates new field access_token
timestamp=[2013-06-16T04:36:49.000Z]
access_token=[545f5sfsd5sf]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16
2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111
on port 1234]
Severity=[3]
Facility=[22]

Step 4: tryRules creates field severity_label from Severity
timestamp=[2013-06-16T04:36:49.000Z]
message=[<179>Jun 16 04:36:49 colo01-wxp00-ace01b-connect.webex.com Jun 16
2013 04:36:49 : %ACE-3-251008: Health probe failed for server 10.240.22.111
on port 1234]
Severity=[3]
Facility=[22]
severity_label=[error]

Step 5: tryRules creates new fields
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111
on port 1234]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]

Step 6: sanitizeUnknownSolrFields drops non-schema fields
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111
on port 1234]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]

Step 7: generateUUID creates a unique id for the document
timestamp=[2013-06-16T04:36:49.000Z]
syslog_message=[%ACE-3-251008: Health probe failed for server 10.240.22.111
on port 1234]
id=[b2f839c3-dece-404f-a535-e0141ad549bf]
cisco_product=[ACE]
cisco_level=[3]
cisco_id=[251008]

Step 8: loadSolr loads a record into a Solr server
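
Several of the steps above are simple record-to-record transformations, which the sketch below mimics with plain Python functions over a dict. It is not Morphlines code. The function names are hypothetical. Copying `category` into `access_token` is an assumption suggested by the matching values on the slides. `SCHEMA_FIELDS` is taken from the field list of the final document.

```python
import uuid
from datetime import datetime, timezone

# Field names of the final Solr document shown on the slides.
SCHEMA_FIELDS = {"id", "timestamp", "host", "Severity", "Facility",
                 "severity_label", "access_token", "syslog_message",
                 "cisco_product", "cisco_level", "cisco_id", "cisco_code"}


def convert_timestamp(record):
    """Epoch milliseconds -> ISO 8601 in UTC (step 2)."""
    millis = int(record["timestamp"])
    moment = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    record["timestamp"] = moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis % 1000:03d}Z"
    return record


def add_access_token(record):
    """Assumption: access_token is copied from the category header (step 3)."""
    record["access_token"] = record["category"]
    return record


def sanitize_unknown_fields(record):
    """Drop every field that is not part of the schema (step 6)."""
    return {key: value for key, value in record.items() if key in SCHEMA_FIELDS}


def generate_uuid(record):
    """Attach a random unique document id (step 7)."""
    record["id"] = str(uuid.uuid4())
    return record


def run_pipeline(record, commands):
    for command in commands:
        record = command(record)
    return record


event = {"timestamp": "1371357409000", "category": "545f5sfsd5sf",
         "Severity": "3", "Facility": "22", "message": "<179>Jun 16 04:36:49 ..."}
document = run_pipeline(dict(event), [convert_timestamp, add_access_token,
                                      sanitize_unknown_fields, generate_uuid])
print(document)
```

Each function takes and returns a record, so commands can be reordered or reused, which is the property the command-chain design relies on.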

[Diagram: SolrSink morphline recap. A Flume syslog event (headers + body) passes as a Record through readLine, grok, tryRules, addValues, ..., loadSolr and is loaded into SolrCloud as a document matching schema.xml.]

• Collections, shards & replicas
• Pluggable file system
• Central config & coordination with ZK
• Full HA, automatic fail-over
• NRT indexing
• Automatic routing
[Diagram: SolrCloud cluster holding the collection “syslog” with three shards (Shard1 to Shard3, each with a leader and a replica), a ZooKeeper ensemble (zk1, zk2, zk3) and a pluggable filesystem (local, HDFS). Labels: "Add doc to syslog index", "Where can I index data?", answer "leader3".]

Special case of search
• Logs are time series data: timestamp + data
• High indexing rate, no updates
• New data is more frequently searched than old
Collection aliases
• Time partitioned collections – e.g. one collection per day
• Reduces the workload to near-real-time data only
• One-to-many collection mapping: queries go to a logical representation
mapped to multiple, same-schema collections
• Simplifies hot-warm-cold migration of data
Index expiration
• Old data is aged out by Collection Aliases
• Remap only the latest collection to an alias
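
A sketch of how daily collections and a rolling alias could be computed. The naming scheme `syslog_YYYYMMDD`, the alias name `syslog_recent`, the three-day window and the host are assumptions. The URL targets the Solr Collections API `CREATEALIAS` action, which replaces an existing alias of the same name. The code only builds the request URL and does not send it.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def daily_collections(today, days, prefix="syslog"):
    """Names of the newest `days` daily collections, newest first."""
    return [f"{prefix}_{today - timedelta(days=n):%Y%m%d}" for n in range(days)]


def create_alias_url(base_url, alias, collections):
    """Collections API URL that points `alias` at the given collections."""
    query = urlencode({"action": "CREATEALIAS",
                       "name": alias,
                       "collections": ",".join(collections)})
    return f"{base_url}/admin/collections?{query}"


# Hypothetical host and alias; re-running this daily slides the window forward.
recent = daily_collections(date(2013, 6, 16), days=3)
url = create_alias_url("http://solr.example.com:8983/solr", "syslog_recent", recent)
print(recent)
print(url)
```

Collections that fall out of the window stay on disk until they are deleted or archived, so expiry from search and deletion of data remain separate decisions.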

Solr
• No multi-datacenter cluster support
HDFS
• No multi-datacenter cluster support
Options?
• All our services must survive DC outage
• ... so should logging and indexing

[Diagram: DC1 storage outage with replicated flows. Collector tier: Flume agents in DC 1, DC 2 ... DC N receiving syslog, log4j and file input. Storage tier: Flume, HDFS and SolrCloud in DC 1 and DC 2.]
Annotations:
• Planned or unplanned outage
• Flume Collector disk channel buffering DC1 events
• DC1 Hadoop cluster back online after outage
• Replicate aggregate data

[Diagram: alternative DR setup with data sent to a single DC. Collector tier: Flume agents in DC 1, DC 2 ... DC N receiving syslog, log4j and file input. Storage tier: Flume, HDFS and SolrCloud in DC 1 and DC 2, with distcp between the two HDFS clusters.]
Annotations:
• distcp
• Manual CNAME change to DC2
• DC1 back online, sync data from DC2
• Data sent only to a single DC
• DNS CNAME change back to DC1
• Flip distcp the other way
• Flume buffering events at collector tier
• Create indexes with M/R

Tiers to scale
• Flume Collector tier
• Flume Storage tier
• SolrCloud

100 – 5000 servers per datacenter
[Diagram: scaling the Flume Collector tier for more agents and data. Each Flume Collector runs an Avro source, a replicating fan-out flow, two File channels and Avro sinks to DC1 and DC2.]
FileChannel: 14MB/sec
NIC: 100MB/sec
Max per server: 14MB/s, 1.2 TB/day, 70k events/s
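
The per-server limits are consistent with each other, as a quick calculation shows. This sketch assumes decimal units (1 TB = 1,000,000 MB). The 100 MB/s aggregate load in the sizing example is a hypothetical figure, and `collectors_needed` is an illustrative helper that ignores headroom and redundancy.

```python
import math

FILE_CHANNEL_MB_PER_SEC = 14   # from the slide
EVENTS_PER_SEC = 70_000        # from the slide
SECONDS_PER_DAY = 86_400

# 14 MB/s sustained for a day, in decimal terabytes.
tb_per_day = FILE_CHANNEL_MB_PER_SEC * SECONDS_PER_DAY / 1_000_000

# Average event size implied by the two slide figures.
avg_event_bytes = FILE_CHANNEL_MB_PER_SEC * 1_000_000 // EVENTS_PER_SEC


def collectors_needed(total_mb_per_sec):
    """Servers required if the FileChannel is the bottleneck (no headroom)."""
    return math.ceil(total_mb_per_sec / FILE_CHANNEL_MB_PER_SEC)


print(f"{tb_per_day:.2f} TB/day, {avg_event_bytes} bytes/event")
print(collectors_needed(100), "collectors for a hypothetical 100 MB/s load")
```

The FileChannel at 14 MB/s, not the 100 MB/s NIC, is the limiting resource, so the tier scales by adding servers rather than faster network links.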

[Diagram: scaling the Flume Storage tier. Collectors in DC 1, DC 2 ... DC N each use Avro sinks 1 ... N to reach the storage-tier Flume servers (Flume 1 ...) in DC 1 and DC 2. Each storage-tier server runs an Avro source, a multiplexing fan-out flow, two File channels, an HDFS sink and a Solr sink.]
Max per server: 14MB/s, 1.2 TB/day, 70k events/s

[Diagram: SolrCloud cluster with three shards (Shard1 to Shard3, each with a leader and a replica), a ZooKeeper ensemble (zk1, zk2, zk3) and a pluggable filesystem (local, HDFS), receiving new logs to index and search queries.]
1000 tx/sec/core
2x8 cores = 16k tx/sec
3 shards: 3 x 16k = 48k tx/sec
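
The shard arithmetic can be turned into a small sizing helper. This sketch takes the slide's figures at face value and assumes throughput scales linearly with cores and shards, which real clusters only approximate. The 70k events/s target reuses the Flume per-server figure as an example load.

```python
import math

TX_PER_SEC_PER_CORE = 1_000   # from the slide
CORES_PER_SERVER = 2 * 8      # from the slide: 2x8 cores

tx_per_server = TX_PER_SEC_PER_CORE * CORES_PER_SERVER


def cluster_tx_per_sec(shards):
    """Aggregate rate, assuming one server per shard and linear scaling."""
    return shards * tx_per_server


def shards_needed(target_tx_per_sec):
    """Smallest shard count whose aggregate rate covers the target."""
    return math.ceil(target_tx_per_sec / tx_per_server)


print(tx_per_server, cluster_tx_per_sec(3), shards_needed(70_000))
```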

Central syslog servers
• Network and OS system messages forwarded to several central syslog
servers
Forward syslog to Solr using Flume Morphline SolrSink
• Parse messages with Morphline and grok patterns
SolrCloud
• Index log lines as documents into a Collection (i.e. index)
HUE Solr search
• Simple UI to build a customized search page layout with faceting, sorting.
• Easy drill down with multiple facets: severity, datacenter, hostname, etc

Screen shots

Search by time
Sort by select field
Facets by selected fields

Wildcard query by field
Highlight the query
keywords

Data sources: REST/JSON, log4j, syslog, Avro, Thrift
Parsing: Cloudera Morphlines
NRT Indexing: SolrCloud embedded in CDH
Batch indexing: MapReduce
Analytics: Use your favorite tool, raw detailed data stored in HDFS

email: ari.flink@webex.com
twitter: @raaka

Thank you.
