Tuning Solr for Logs: Presented by Radu Gheorghe, Sematext
@radu0gheorghe @sematext
/me does... 
search consulting + Logsene = logging consulting
Tuning. Is it worth it?
                  baseline    last run
# of logs         10M         310M
EC2 bill/month    700         450
What to optimize for?
capacity: how many logs the same hardware can keep while still providing decent performance
What's decent performance? “It depends” 
indexing: enough to keep up with generated logs* 
search concurrency 
search latency: 2s for debug queries, 5s for charts 
*account for spikes!
Enough theory, let's start testing! 
Solr instance 
m3.2xlarge (8CPU, 30GB RAM, 2x80GB SSD) 
Solr 4.10.1 
Feeder instance 
c3.2xlarge (8CPU, 15GB RAM, 2x80GB SSD) 
apache access logs 
python script to parse and feed them
Baseline test 
15GB heap 
debug query 
status:404 in the last hour 
charts query 
all time status counters 
all time top IPs 
user agent word cloud
Baseline result
(charts plotted against number of logs indexed; axis ticks: 100K 2.5M 4M 6M 9M 10M)
on average, bottleneck: facets eat CPU
indexing limited because the python script eats feeder CPU
Indexing throughput: is it enough? 
“it depends” 
how long do you keep your logs? 
1M logs/day * 10 days <> 0.3M logs/day * 30 days. Both need 10M capacity 
1M logs/day * 30 days? Needs 3 servers, each getting 0.3M logs/day 
Baseline run: 10M index fills up in <1/2h at 7K EPS
how big are your spikes? (assumption: 10x regular load) 
7K EPS is enough for 10M capacity if you keep logs >5h
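The arithmetic on this slide can be sketched in a few lines of Python. This is one reading of the numbers, not the presenter's own calculation: it assumes the index is exactly full at the end of the retention period and that spikes reach `spike_factor` times the average rate. The function names are made up for illustration.

```python
import math


def fill_time_hours(capacity_logs, eps):
    """Hours needed to fill an index of `capacity_logs` at a steady `eps`."""
    return capacity_logs / eps / 3600


def min_retention_hours(capacity_logs, max_eps, spike_factor):
    """Shortest retention for which `max_eps` still absorbs spikes.

    Assumption: average rate = max_eps / spike_factor, and the index
    holds exactly `capacity_logs` at the end of the retention period.
    """
    sustainable_avg_eps = max_eps / spike_factor
    return capacity_logs / sustainable_avg_eps / 3600


def servers_needed(logs_per_day, retention_days, capacity_per_server):
    """Servers required to hold `retention_days` worth of logs."""
    return math.ceil(logs_per_day * retention_days / capacity_per_server)


print(fill_time_hours(10_000_000, 7_000))          # roughly 0.4 h, under half an hour
print(min_retention_hours(10_000_000, 7_000, 10))  # roughly 4 h
print(servers_needed(1_000_000, 30, 10_000_000))   # 3
```

Under these assumptions the break-even retention comes out at about 4 hours, so the ">5h" on the slide leaves some margin.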
Rare commits
auto soft commits every 5 seconds
auto hard commits every 30 minutes
RAMBufferSize=200MB; maxBufferedDocs=10M
10% above baseline (chart ticks: 1.5M 3M 5M 8M 11M)
Same results with
even rarer commits (auto-soft every 30s, 500MB buffer)
omitNorms + omitTermFreqAndPositions
larger caches
cache autowarming
THP disabled
mergeFactor 5
mergeFactor 20
(slide notes: but indexing was cheaper; manually ran queries, too)
DocValues on IP and status code
20% above baseline (chart ticks: 1.5M 3M 5M 8M 10M 12M)
Detour: what if user agent was string?
3.6x baseline (chart ticks: 3M 10M 18M 24M 31M 36M)
… and if user agent used DocValues?
6.7x baseline (chart ticks: 8M 16M 24M 32M 40M 48M 56M 64M 67M 69M 70M 70.5M)
reducing indexing adds 5% capacity
Time based collections (1 minute)
2.7x baseline (chart ticks: 3M 7M 11M 15M 19M 23M 27M 28M)
OOM (150 collections)
Time based collections (10 minutes)
21x baseline (chart ticks: 10M 40M 70M 100M 130M 160M 190M 213M)
still OOM (~100 collections)
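To make the idea of time-based collections concrete, here is a minimal Python sketch that maps a log timestamp to the collection covering its time bucket. The naming scheme (`logs_YYYYMMDD_HHMM`) and the helper name are hypothetical, not taken from the talk; `bucket_minutes` is assumed to divide 60 evenly.

```python
from datetime import datetime, timezone


def collection_for(ts, bucket_minutes=10, prefix="logs"):
    """Return the (hypothetical) collection name for the bucket containing `ts`."""
    # Round the minute down to the start of its bucket.
    minute = ts.minute - ts.minute % bucket_minutes
    start = ts.replace(minute=minute, second=0, microsecond=0)
    return f"{prefix}_{start:%Y%m%d_%H%M}"


ts = datetime(2014, 11, 7, 13, 47, 12, tzinfo=timezone.utc)
print(collection_for(ts))                    # logs_20141107_1340
print(collection_for(ts, bucket_minutes=1))  # logs_20141107_1347
```

Shorter buckets mean more collections for the same retention, which is in line with the 1-minute run hitting OOM at a lower log count than the 10-minute run.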
10min collections: 20GB heap; optimize old
(chart ticks: 50M 100M 150M 200M 250M 300M 310M 330M 340M)
31x baseline, 5 days projected retention with 10x spikes
no more OOM, just slower queries
34x baseline, 10 days projected retention (10x)
Software optimizations recap
Definitely worth it: DocValues; rare soft commits
Nice to have: noop I/O scheduler; optimize “old”
I wouldn't bother: merge policy tuning; omit norms, term frequencies and positions; super-rare soft commits; disable THP
r3.2xlarge: +30GB RAM, +$0.14/h, 1x160GB SSD
(chart ticks: 20M 70M 120M 170M 220M 270M 320M 372M)
less indexing throughput than m3.2xlarge
37x baseline, 9 days projected retention with 10x spikes
c3.2xlarge: -15GB RAM, -$0.14/h
(chart ticks: 20M 50M 80M 110M 140M 170M 177M)
17x baseline, 5 days projected retention with 10x spikes
Monthly EC2 cost per 1M logs* 
m3.2xlarge: $1.3 
r3.2xlarge: $1.33 
c3.2xlarge: $1.78 
TODO (a.k.a. truth always messes with simplicity): 
more/expensive facets => more CPU => c3 looks better 
less/cheap facets => not enough instance storage 
=> EBS (magnetic/SSD/provisioned IOPS)? 
=> storage-optimized i2? 
=> old-gen instances with magnetic instance storage? 
use different instance types for “hot” and “cold” collections? 
*on-demand pricing at 2014-11-07
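A cost-per-million-logs figure like the ones above can be derived from an hourly price and the capacity reached in a test run. The sketch below shows one way to do it. The 720-hour month and the $0.50/h price are placeholders for illustration, not the 2014 on-demand rates, so the output is not expected to match the slide.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month


def monthly_cost_per_million_logs(hourly_price_usd, capacity_logs,
                                  hours_per_month=HOURS_PER_MONTH):
    """Monthly instance bill divided by capacity, expressed per 1M logs."""
    monthly_bill = hourly_price_usd * hours_per_month
    return monthly_bill / (capacity_logs / 1_000_000)


# Placeholder price; the capacity is the 310M reached in the m3.2xlarge run.
print(round(monthly_cost_per_million_logs(0.50, 310_000_000), 2))  # 1.16
```

The same function can be used to compare instance types once real prices and measured capacities are plugged in.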
How NOT to build an indexing pipeline 
custom script: 
reads apache logs from files 
parses them using regex 
takes 100% CPU and 100% RAM from a c3.2xlarge instance
maxes out at 7K EPS
Enter Apache Flume*
flume.conf:
agent.sources = spoolSrc
agent.sources.spoolSrc.type = spooldir
agent.sources.spoolSrc.spoolDir = /var/log
agent.sources.spoolSrc.channels = solrChannel
agent.channels = solrChannel
agent.channels.solrChannel.type = file
# put Solr and Morphline jars in lib/
agent.sinks = solrSink
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.morphlineFile = conf/morphline.conf
agent.sinks.solrSink.morphlineId = 1
agent.sinks.solrSink.channel = solrChannel
*Or Logstash. Or rsyslog. Or syslog-ng. Or any other specialized event processing tool
morphline.conf (think Unix pipes)
morphlines : [
  { id : 1    # same ID as in the flume.conf sink definition
    commands : [
      # process one line at a time (there's also readMultiLine)
      { readLine { charset : UTF-8 } }
      # parses each property (eg: IP, status code) in its own field; Solr can do it, too
      { grok {
          dictionaryFiles : [conf/grok-patterns]
          expressions : {
            message : """%{COMBINEDAPACHELOG}"""
          }
      } }
      { generateUUID { field : id } }
      { loadSolr {
          solrLocator : {
            collection : collection1
            solrUrl : ""    # use zkHost for SolrCloud
          }
      } }
    ]
  }
]
Result: 2.4K EPS, feeder machine almost idle
2.4K EPS is typically enough for this:
  3x (application server + Flume agent)
  scales nicely with # of servers, but all buffering and processing is done here (on the application servers)
but not for this:
  3x (application server + Flume agent) feeding 2 central Flume agents
  centralized buffering and processing
or this:
  3x (application server + Flume agent) feeding 3 central Flume agents
  buffer, then process (separately)
Increase throughput: batch sizes; memory channel
flume.conf:
agent.sources = spoolSrc
agent.sources.spoolSrc.type = spooldir
agent.sources.spoolSrc.spoolDir = /var/log
agent.sources.spoolSrc.batchSize = 5000
# make sure you have enough heap
agent.sources.spoolSrc.channels = solrChannel
agent.channels = solrChannel
# memory channel instead of file
agent.channels.solrChannel.type = memory
agent.channels.solrChannel.capacity = 1000000
agent.channels.solrChannel.transactionCapacity = 5000
agent.sinks = solrSink
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.morphlineFile = conf/morphline.conf
agent.sinks.solrSink.morphlineId = 1
agent.sinks.solrSink.batchSize = 5000
agent.sinks.solrSink.channel = solrChannel
morphline.conf (loadSolr command):
solrLocator : {
  collection : collection1
  solrUrl : ""
  batchSize : 5000
}
Result: 10K EPS, 6%CPU usage (2x baseline)
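The gain from larger batch sizes comes from paying per-request overhead once per batch instead of once per event. The Python sketch below illustrates the batching idea only. It is not Flume code, and `batches` is a made-up helper name.

```python
from itertools import islice


def batches(events, batch_size=5000):
    """Yield lists of up to `batch_size` events from any iterable."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# 12,000 events are sent as three requests instead of 12,000.
sizes = [len(b) for b in batches(range(12_000))]
print(sizes)  # [5000, 5000, 2000]
```

Each batch is held in memory until it is sent, which is why larger batch sizes and a memory channel call for enough heap.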
More throughput? Parallelize
Depends* on the bottleneck:
source: more threads (if applicable); more sources
channel: channel selector
sink: more threads (if applicable); load balancing sink processor
(diagram labels: Source1, C1, Sink1)
*last time I use this word, I promise
Result: default Solr install maxed out at 24K EPS
TODO: log in JSON where you can 
Then, in morphline.conf, replace the grok command with the much lighter:
readJson {} 
Easy with apache logs, maybe not for other apps: 
LogFormat "{ \
  \"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
  \"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
  \"method\": \"%m\", \
  \"referer\": \"%{Referer}i\", \
  \"useragent\": \"%{User-agent}i\" \
}" ls_apache_json
CustomLog /var/log/apache2/logstash_test.ls_json ls_apache_json 
More details at:
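To illustrate why JSON logs are cheaper to process than grok patterns, here is a small Python sketch that parses one line in the shape produced by the LogFormat above. The sample values (IP, timestamp, user agent) are invented for the example.

```python
import json

# One log line as Apache would write it with the JSON LogFormat (sample values).
line = (
    r'{"@timestamp": "2014-11-07T13:47:12+0000", '
    r'"message": "10.0.0.1 - - [07/Nov/2014:13:47:12 +0000] \"GET / HTTP/1.1\" 404 512", '
    r'"method": "GET", "referer": "-", "useragent": "curl/7.38.0"}'
)

# A single json.loads call yields every field; no regex dictionary is needed.
event = json.loads(line)
print(event["method"], event["useragent"])  # GET curl/7.38.0
```

The fields arrive already separated, so the consumer only has to deserialize each line rather than match it against a pattern.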
Use time-based collections and DocValues 
Rare soft&hard commits are good 
Pushing them too far is probably not worth it 
Hardware: test and see what works for you 
A balanced, SSD-backed machine (like m3) is a good start 
Use specialized event processing tools 
Apache Flume is a fine example 
Processing and buffering on the application server side scales better 
Buffer before [heavy] processing 
Mind your batch sizes, buffer types and parallelization 
Log in JSON where you can
Thank you! 
Feel free to poke me @radu0gheorghe 
Check us out at the booth, and @sematext 
We're hiring, too!

Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond

Recently uploaded

Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...
Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...
Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...
Andre Hora
Mastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GISMastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GIS
Safe Software
Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...
Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...
Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...
David D. Scott
Fixing Git Catastrophes - Nebraska.Code()
Fixing Git Catastrophes - Nebraska.Code()Fixing Git Catastrophes - Nebraska.Code()
Fixing Git Catastrophes - Nebraska.Code()
Gene Gotimer
Understanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdfUnderstanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdf
03. Ruby Variables & Regex - Ruby Core Teaching
03. Ruby Variables & Regex - Ruby Core Teaching03. Ruby Variables & Regex - Ruby Core Teaching
03. Ruby Variables & Regex - Ruby Core Teaching
How to Secure Your Kubernetes Software Supply Chain at Scale
How to Secure Your Kubernetes Software Supply Chain at ScaleHow to Secure Your Kubernetes Software Supply Chain at Scale
How to Secure Your Kubernetes Software Supply Chain at Scale
iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...
iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...
iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...
OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...
OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...
OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...
Shane Coughlan
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdfTop 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Banibro IT Solutions
Old Tools, New Tricks: Unleashing the Power of Time-Tested Testing Tools
Old Tools, New Tricks: Unleashing the Power of Time-Tested Testing ToolsOld Tools, New Tricks: Unleashing the Power of Time-Tested Testing Tools
Old Tools, New Tricks: Unleashing the Power of Time-Tested Testing Tools
Benjamin Bischoff
CrushFTP PC Software - WhizNews
CrushFTP PC Software - WhizNewsCrushFTP PC Software - WhizNews
CrushFTP PC Software - WhizNews
Eman Nisar
4. The Build System _ Embedded Android.pdf
4. The Build System _ Embedded Android.pdf4. The Build System _ Embedded Android.pdf
4. The Build System _ Embedded Android.pdf
BDRSuite - #1 Cost effective Data Backup and Recovery Solution
BDRSuite - #1 Cost effective Data Backup and Recovery SolutionBDRSuite - #1 Cost effective Data Backup and Recovery Solution
BDRSuite - #1 Cost effective Data Backup and Recovery Solution
04. Ruby Operators Slides - Ruby Core Teaching
04. Ruby Operators Slides - Ruby Core Teaching04. Ruby Operators Slides - Ruby Core Teaching
04. Ruby Operators Slides - Ruby Core Teaching
UW Cert degree offer diploma
UW Cert degree offer diploma UW Cert degree offer diploma
UW Cert degree offer diploma
Applitools Autonomous 2.0 Sneak Peek.pdf
Applitools Autonomous 2.0 Sneak Peek.pdfApplitools Autonomous 2.0 Sneak Peek.pdf
Applitools Autonomous 2.0 Sneak Peek.pdf
Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...
Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...
Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...
Alluxio, Inc.
How Generative AI is Shaping the Future of Software Application Development
How Generative AI is Shaping the Future of Software Application DevelopmentHow Generative AI is Shaping the Future of Software Application Development
How Generative AI is Shaping the Future of Software Application Development
Predicting Test Results without Execution (FSE 2024)
Predicting Test Results without Execution (FSE 2024)Predicting Test Results without Execution (FSE 2024)
Predicting Test Results without Execution (FSE 2024)
Andre Hora

Recently uploaded (20)

Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...
Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...
Monitoring the Execution of 14K Tests: Methods Tend to Have One Path that Is ...
Mastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GISMastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GIS
Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...
Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...
Tube Magic Software | Youtube Software | Best AI Tool For Growing Youtube Cha...
Fixing Git Catastrophes - Nebraska.Code()
Fixing Git Catastrophes - Nebraska.Code()Fixing Git Catastrophes - Nebraska.Code()
Fixing Git Catastrophes - Nebraska.Code()
Understanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdfUnderstanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdf
03. Ruby Variables & Regex - Ruby Core Teaching
03. Ruby Variables & Regex - Ruby Core Teaching03. Ruby Variables & Regex - Ruby Core Teaching
03. Ruby Variables & Regex - Ruby Core Teaching
How to Secure Your Kubernetes Software Supply Chain at Scale
How to Secure Your Kubernetes Software Supply Chain at ScaleHow to Secure Your Kubernetes Software Supply Chain at Scale
How to Secure Your Kubernetes Software Supply Chain at Scale
iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...
iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...
iBirds Services - Comprehensive Salesforce CRM and Software Development Solut...
OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...
OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...
OpenChain Webinar: IAV, TimeToAct and ISO/IEC 5230 - Third-Party Certificatio...
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdfTop 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Old Tools, New Tricks: Unleashing the Power of Time-Tested Testing Tools
Old Tools, New Tricks: Unleashing the Power of Time-Tested Testing ToolsOld Tools, New Tricks: Unleashing the Power of Time-Tested Testing Tools
Old Tools, New Tricks: Unleashing the Power of Time-Tested Testing Tools
CrushFTP PC Software - WhizNews
CrushFTP PC Software - WhizNewsCrushFTP PC Software - WhizNews
CrushFTP PC Software - WhizNews
4. The Build System _ Embedded Android.pdf
4. The Build System _ Embedded Android.pdf4. The Build System _ Embedded Android.pdf
4. The Build System _ Embedded Android.pdf
BDRSuite - #1 Cost effective Data Backup and Recovery Solution
BDRSuite - #1 Cost effective Data Backup and Recovery SolutionBDRSuite - #1 Cost effective Data Backup and Recovery Solution
BDRSuite - #1 Cost effective Data Backup and Recovery Solution
04. Ruby Operators Slides - Ruby Core Teaching
04. Ruby Operators Slides - Ruby Core Teaching04. Ruby Operators Slides - Ruby Core Teaching
04. Ruby Operators Slides - Ruby Core Teaching
UW Cert degree offer diploma
UW Cert degree offer diploma UW Cert degree offer diploma
UW Cert degree offer diploma
Applitools Autonomous 2.0 Sneak Peek.pdf
Applitools Autonomous 2.0 Sneak Peek.pdfApplitools Autonomous 2.0 Sneak Peek.pdf
Applitools Autonomous 2.0 Sneak Peek.pdf
Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...
Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...
Alluxio Webinar | What’s new in Alluxio Enterprise AI 3.2: Leverage GPU Anywh...
How Generative AI is Shaping the Future of Software Application Development
How Generative AI is Shaping the Future of Software Application DevelopmentHow Generative AI is Shaping the Future of Software Application Development
How Generative AI is Shaping the Future of Software Application Development
Predicting Test Results without Execution (FSE 2024)
Predicting Test Results without Execution (FSE 2024)Predicting Test Results without Execution (FSE 2024)
Predicting Test Results without Execution (FSE 2024)

Tuning Solr for Logs: Presented by Radu Gheorghe, Sematext

  • 2. Tuning Solr for Logs Radu Gheorghe @radu0gheorghe @sematext
  • 3. /me does... .com/logsene search consulting + Logsene = logging consulting
  • 4. Tuning. Is it worth it? # of logs: 10M (baseline) vs. 310M (last run); EC2 bill/month: 700 (baseline) vs. 450 (last run)
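To make the comparison on this slide concrete, here is a minimal sketch that turns the two bill and capacity pairs into a cost per 1M logs. The function name is hypothetical, and the slide gives no currency or pricing basis for the bill figures, so the results are only relative.

```python
def cost_per_million_logs(monthly_bill, capacity_logs):
    """Monthly bill divided by the capacity expressed in millions of logs."""
    return monthly_bill / (capacity_logs / 1_000_000)


# Figures taken from the slide: (bill/month, # of logs)
baseline = cost_per_million_logs(700, 10_000_000)
last_run = cost_per_million_logs(450, 310_000_000)

# Capacity grew 31x while the bill went down.
capacity_gain = 310_000_000 / 10_000_000
```

Under these numbers the baseline works out to 70 per 1M logs and the last run to roughly 1.45 per 1M logs.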
  • 5. What to optimize for? capacity: how many logs the same hardware can keep while still providing decent performance
  • 6. What's decent performance? “It depends”. Assumptions: indexing: enough to keep up with generated logs*; search concurrency; search latency: 2s for debug queries, 5s for charts. *account for spikes!
  • 7. Enough theory, let's start testing! Solr instance: m3.2xlarge (8CPU, 30GB RAM, 2x80GB SSD), Solr 4.10.1. Feeder instance: c3.2xlarge (8CPU, 15GB RAM, 2x80GB SSD), apache access logs, python script to parse and feed them
  • 8. Baseline test: 15GB heap. Debug query: status:404 in the last hour. Charts query: all time status counters, all time top IPs, user agent word cloud
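The two test queries could look roughly like the request URLs built below. This is a sketch under assumptions: only `status:404` comes from the slide, while the endpoint (default Solr port, `collection1`) and the field names `timestamp`, `ip` and `user_agent` are hypothetical. The parameters themselves (`q`, `fq`, `rows`, `facet`, `facet.field`, `wt`) are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Assumed endpoint; the slides do not show the Solr URL.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def debug_query_url():
    """Debug query: status:404 restricted to the last hour."""
    params = [
        ("q", "status:404"),
        ("fq", "timestamp:[NOW-1HOUR TO NOW]"),  # assumed date field name
        ("rows", "10"),
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def charts_query_url():
    """Charts query: all-time facet counts on status, IP and user agent."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),                  # only facet counts are needed
        ("facet", "true"),
        ("facet.field", "status"),
        ("facet.field", "ip"),          # assumed field name
        ("facet.field", "user_agent"),  # assumed field name
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)
```

The charts query is the expensive one: it facets over every document, which fits the later observation that facets eat CPU.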
  • 9. Baseline result [chart: debug, charts and EPS series; x-axis from 100K to 10M logs]
  • 10. Baseline result [same chart], annotated: capacity
  • 11. Baseline result [same chart], annotated: capacity; bottleneck: facets eat CPU
  • 12. Baseline result [same chart], annotated: capacity; bottleneck: facets eat CPU; on average, CPU is OK
  • 13. Baseline result [same chart], annotated: capacity; bottleneck: facets eat CPU; on average, CPU is OK; indexing limited because the Python script eats feeder CPU
  • 14. Indexing throughput: is it enough? “It depends”: how long do you keep your logs? 1M logs/day * 10 days <> 0.3M logs/day * 30 days. Both need 10M capacity. 1M logs/day * 30 days? Needs 3 servers, each getting 0.3M logs/day. Baseline run: 10M index fills up in <1/2h at 7K EPS.
  • 15. Indexing throughput: is it enough? “It depends”: how long do you keep your logs? 1M logs/day * 10 days <> 0.3M logs/day * 30 days. Both need 10M capacity. 1M logs/day * 30 days? Needs 3 servers, each getting 0.3M logs/day. How big are your spikes? (assumption: 10x regular load). 7K EPS is enough for 10M capacity if you keep logs >5h.
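The arithmetic behind the last two slides can be checked with a short sketch. The function names are hypothetical; the inputs (10M capacity, 7K EPS, 10x spikes, 5 hours retention) are the figures from the slides.

```python
def seconds_to_fill(capacity_logs, eps):
    """How long it takes to fill the index at a constant indexing rate."""
    return capacity_logs / eps


def required_peak_eps(capacity_logs, retention_hours, spike_factor=10):
    """Peak rate needed if the index holds retention_hours of logs
    and spikes reach spike_factor times the average rate."""
    average_eps = capacity_logs / (retention_hours * 3600)
    return average_eps * spike_factor


fill_minutes = seconds_to_fill(10_000_000, 7_000) / 60   # under half an hour
peak_at_5h = required_peak_eps(10_000_000, 5)            # below 7K EPS
```

With 5 hours of retention the average rate is about 556 EPS, so a 10x spike needs about 5.6K EPS, which the 7K EPS baseline covers.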
  • 16. Rare commits: 10% above baseline. Auto soft commits every 5 seconds; auto hard commits every 30 minutes; RAMBufferSize=200MB; maxBufferedDocs=10M [chart: debug, charts and EPS series; x-axis from 1.5M to 11M logs]
  • 17. Same results with: even rarer commits (auto-soft every 30s, 500MB buffer); omitNorms + omitTermFreqAndPositions; larger caches; cache autowarming; THP disabled; mergeFactor 5; mergeFactor 20 (but indexing was cheaper); manually ran queries, too
  • 18. DocValues on IP and status code: 20% above baseline [chart: debug, charts and EPS series; x-axis from 1.5M to 12M logs]
  • 19. Detour: what if user agent was string? 3.6x baseline [chart: debug, charts and EPS series; x-axis from 3M to 36M logs]
  • 20. … and if user agent used DocValues? 6.7x baseline; reducing indexing adds 5% capacity [chart: debug, charts and EPS series; x-axis from 8M to 70.5M logs]
  • 21. Time based collections (1 minute): 2.7x baseline; OOM (150 collections) [chart: debug, charts and EPS series; x-axis from 3M to 28M logs]
  • 22. Time based collections (10 minutes): 21x baseline; still OOM (~100 collections) [chart: debug, charts and EPS series; x-axis from 10M to 213M logs]
  • 23. 10min collections: 20GB heap; optimize old. 31x baseline, 5 days projected retention with 10x spikes; no more OOM, just slower queries; 34x baseline, 10 days projected retention (10x) [chart: debug, charts and EPS series; x-axis from 50M to 340M logs]
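One way a feeder could route each log to a time-based collection is to derive the collection name from the event timestamp. The sketch below is an assumption about how that might be done, not the setup used in the talk: the `logs_` prefix and the name format are hypothetical, and the bucket size is assumed to divide 60.

```python
from datetime import datetime


def collection_for(ts, bucket_minutes=10, prefix="logs"):
    """Return the name of the time-based collection that holds timestamp ts.

    Assumes bucket_minutes divides 60, so buckets never span two hours.
    """
    floored_minute = ts.minute - ts.minute % bucket_minutes
    bucket_start = ts.replace(minute=floored_minute, second=0, microsecond=0)
    return f"{prefix}_{bucket_start:%Y%m%d_%H%M}"


event_time = datetime(2014, 11, 7, 12, 34, 56)
ten_minute = collection_for(event_time)                   # logs_20141107_1230
one_minute = collection_for(event_time, bucket_minutes=1)  # logs_20141107_1234
```

Smaller buckets mean more collections for the same amount of data, which is in line with the 1-minute run hitting OOM at 150 collections while the 10-minute run got much further.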
  • 24. Software optimizations recap (Definitely worth it / Nice to have / I wouldn't bother): time-based collections; noop I/O scheduler; merge policy tuning; DocValues; omit norms, term frequencies and positions; autowarm; rare soft commits; optimize “old” collections; super-rare soft commits; disable THP
  • 25. r3.2xlarge: +30GB RAM, +$0.14/h, 1x160GB SSD. Less indexing throughput than m3.2xlarge. 37x baseline, 9 days projected retention with 10x spikes [chart: debug, charts and EPS series; x-axis from 20M to 372M logs]
  • 26. c3.2xlarge: -15GB RAM, -$0.14/h. 17x baseline, 5 days projected retention with 10x spikes [chart: debug, charts and EPS series; x-axis from 20M to 177M logs]
  • 27. Monthly EC2 cost per 1M logs*: m3.2xlarge: $1.3; r3.2xlarge: $1.33; c3.2xlarge: $1.78. TODO (a.k.a. truth always messes with simplicity): more/expensive facets => more CPU => c3 looks better; less/cheap facets => not enough instance storage => EBS (magnetic/SSD/provisioned IOPS)? => storage-optimized i2? => old-gen instances with magnetic instance storage?; use different instance types for “hot” and “cold” collections? *on-demand pricing at 2014-11-07
  • 28. How NOT to build an indexing pipeline: custom script that reads apache logs from files, parses them using regex, takes 100% CPU and 100% RAM from a c3.2xlarge instance, and maxes out at 7K EPS
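For context, regex parsing of an Apache combined log line looks roughly like the sketch below. This is not the script from the talk; the pattern and the field names are assumptions made for illustration.

```python
import re

# Assumed pattern for the Apache "combined" log format.
COMBINED = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    doc = match.groupdict()
    doc["status"] = int(doc["status"])  # numeric status for range queries
    return doc


sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)
doc = parse_line(sample)
```

Running a pattern like this on every line in a single-process script is the kind of per-event work that keeps the feeder CPU busy.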
  • 29. Enter Apache Flume*
    source:
      agent.sources = spoolSrc
      agent.sources.spoolSrc.type = spooldir
      agent.sources.spoolSrc.spoolDir = /var/log
      agent.sources.spoolSrc.channels = solrChannel
    channel:
      agent.channels = solrChannel
      agent.channels.solrChannel.type = file
      = solrChannel
    sink (put Solr and Morphline jars in lib/):
      agent.sinks = solrSink
      agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
      agent.sinks.solrSink.morphlineFile = conf/morphline.conf
      agent.sinks.solrSink.morphlineId = 1
    *Or Logstash. Or rsyslog. Or syslog-ng. Or any other specialized event processing tool
  • 30. morphline.conf (think Unix pipes)
    morphlines : [
      {
        id : 1  # same ID as in the flume.conf sink definition
        commands : [
          # process one line at a time (there's also readMultiLine)
          { readLine { charset : UTF-8 } }
          # parses each property (eg: IP, status code) in its own field
          { grok { dictionaryFiles : [conf/grok-patterns] expressions : { message : """%{COMBINEDAPACHELOG}""" } } }
          # Solr can do it, too*
          { generateUUID { field : id } }
          # use zkHost for SolrCloud
          { loadSolr { solrLocator : { collection : collection1 solrUrl : "" } } }
        ]
      }
    ]
    *
  • 31. Result: 2.4K EPS, feeder machine almost idle
  • 32. 2.4K EPS is typically enough for this: [diagram: three “application server + Flume agent” nodes] scales nicely with # of servers, but all buffering and processing is done here
  • 33. but not for this: [diagram: three “application server + Flume agent” nodes and two separate Flume agents] centralized buffering and processing
  • 34. or this: [diagram: three “application server + Flume agent” nodes and three separate Flume agents] buffer, then process (separately)
  • 35. Increase throughput: batch sizes; memory channel
    agent.sources = spoolSrc
    agent.sources.spoolSrc.type = spooldir
    agent.sources.spoolSrc.spoolDir = /var/log
    agent.sources.spoolSrc.batchSize = 5000
    (make sure you have enough heap)
    agent.sources.spoolSrc.channels = solrChannel
    agent.channels = solrChannel
    agent.channels.solrChannel.type = memory (was: file)
    agent.channels.solrChannel.capacity = 1000000
    agent.channels.solrChannel.transactionCapacity = 5000
    = solrChannel
    solrLocator : { collection : collection1 solrUrl : "" batchSize : 5000 }
    agent.sinks = solrSink
    agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
    agent.sinks.solrSink.morphlineFile = conf/morphline.conf
    agent.sinks.solrSink.morphlineId = 1
    agent.sinks.solrSink.batchSize = 5000
  • 36. Result: 10K EPS, 6% CPU usage (2x baseline)
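The effect of the batch size settings can be pictured with a small sketch: events are grouped into fixed-size batches so that each round trip carries many events instead of one. This illustrates the general idea only and is not Flume's implementation; the function name is hypothetical, and the default of 5000 mirrors the batchSize values on the slide.

```python
from itertools import islice


def batches(events, batch_size=5000):
    """Yield lists of up to batch_size events from any iterable."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Small numbers for illustration: 12 events in batches of 5.
grouped = list(batches(range(12), batch_size=5))
```

The last batch may be shorter than the others, so a consumer should not assume a fixed size.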
  • 37. More throughput? Parallelize. Depends* on the bottleneck. Source: more threads (if applicable), more sources. Channel: multiplexing channel selector. Sink: more threads (if applicable), load balancing sink processor. *last time I use this word, I promise [diagrams with Source1, Source2, C1, C2, Sink1, Sink2]
  • 38. Result: default Solr install maxed out at 24K EPS
  • 39. TODO: log in JSON where you can. Then, in morphline.conf, replace the grok command with the much lighter: readJson {}
    Easy with apache logs, maybe not for other apps:
    LogFormat "{ \"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", ... \"method\": \"%m\", \"referer\": \"%{Referer}i\", \"useragent\": \"%{User-agent}i\" }" ls_apache_json
    CustomLog /var/log/apache2/logstash_test.ls_json ls_apache_json
    More details at:
  • 40. Conclusions: Use time-based collections and DocValues. Rare soft & hard commits are good; pushing them too far is probably not worth it. Hardware: test and see what works for you; a balanced, SSD-backed machine (like m3) is a good start. Use specialized event processing tools; Apache Flume is a fine example. Processing and buffering on the application server side scales better. Buffer before [heavy] processing. Mind your batch sizes, buffer types and parallelization. Log in JSON where you can.
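To show why JSON logs are cheaper to process, here is a minimal sketch of the consumer side: each line is already structured, so parsing is a single `json.loads` call with no pattern matching. The function name and the sample line are hypothetical; the keys mirror those in the LogFormat on the slide.

```python
import json


def parse_json_line(line):
    """Return the log event as a dict, or None if the line is not a JSON object."""
    try:
        doc = json.loads(line)
    except json.JSONDecodeError:
        return None
    return doc if isinstance(doc, dict) else None


sample = (
    '{ "@timestamp": "2014-11-07T12:34:56+0000", "method": "GET", '
    '"referer": "-", "useragent": "curl/7.35.0" }'
)
event = parse_json_line(sample)
```

Lines that fail to parse return None here, so a feeder can count or divert them instead of crashing.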
  • 41. Thank you! Feel free to poke me @radu0gheorghe Check us out at the booth, and @sematext We're hiring, too!