Twitter processes over 500 million tweets per day and more than 2 billion search queries per day. The company uses a search architecture based on Lucene with custom extensions. This includes an in-memory real-time index optimized for concurrency without locks, and a schema-based document factory. Future work includes support for parallel index segments and additional Lucene features.
Dictionary Based Annotation at Scale with Spark by Sujit Pal (Spark Summit)
This document summarizes a presentation about annotating millions of documents at scale using dictionary-based annotation with Apache Spark, Apache Solr, and Apache OpenNLP. The key points discussed include:
- The problem of annotating millions of documents from science corpora and the need to do it efficiently without model training.
- The architecture of SoDA (Dictionary Based Named Entity Annotator), which uses Apache Solr, SolrTextTagger, and OpenNLP for annotation and can be run on Spark for scaling.
- Performance optimizations made including combining paragraphs, tuning Solr garbage collection, using a larger Spark cluster, and scaling out Solr. These helped achieve over 25 documents per second annotation throughput.
Real Time search using Spark and Elasticsearch (Sigmoid)
This document discusses using Spark Streaming and Elasticsearch to enable real-time search and analysis of streaming data. Spark Streaming processes and enriches streaming data and stores it in Elasticsearch for low-latency search and alerts. The elasticsearch-hadoop connector allows Spark jobs to read from and write to Elasticsearch, integrating the batch processing of Spark with the real-time search of Elasticsearch.
Grant Ingersoll presented on using Apache Solr and Apache Spark for data engineering. He discussed how Solr can be used for indexing and searching large amounts of data, while Spark enables large-scale processing on the indexed data. Lucidworks' Fusion product combines Solr and Spark capabilities to allow search-driven applications and machine learning on indexed content.
The document discusses Solr 4, an open source search platform built on Apache Lucene. Some key points:
- Solr 4 is a NoSQL search server that provides distributed indexing, fault tolerance, and real-time search capabilities.
- Solr Cloud is Solr's distributed architecture which uses Zookeeper for coordination to provide features like automatic sharding and replication of indexes across multiple servers.
- The document outlines Solr 4's capabilities including schema-less options, atomic updates, optimistic concurrency, and a REST API for managing the schema dynamically.
SolrCloud in Public Cloud: Scaling Compute Independently from Storage - Ilan ... (Lucidworks)
Running SolrCloud in the public cloud is the future. This presentation, and the code that will be contributed back to the community, will make such clusters highly efficient, scalable, and elastic. Attendees will come away understanding the challenges and potential of sharing index data between servers.
Speakers: Ilan Ginzburg & Yonik Seeley, Salesforce
Apache Solr is a powerful search and analytics engine offering full-text search, faceting, joins, and sorting, and it can handle large amounts of data across a large number of servers. With all that power and scalability, however, comes complexity. Solr 6 introduces a Parallel SQL feature that provides a simplified, well-known interface to your data in Solr, performs key operations such as sorts and shuffles inside Solr for massive speedups, applies best-practice query optimization, and, by leveraging the scalability of SolrCloud and a clever implementation, lets you throw massive amounts of compute at analytical queries.
In this talk, we will explore the why, what and how of Parallel SQL and its building block Streaming Expressions in Solr 6 with a hint of the exciting new developments around this feature.
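To make the idea concrete, here is a minimal sketch of what a Parallel SQL call looks like. The collection and field names are illustrative, not from the talk; with a running SolrCloud you would POST these parameters to the collection's `/sql` handler.

```python
# Build the request parameters for Solr's /sql endpoint (not sent here).
# "products", "category", and "in_stock" are made-up example names.
stmt = (
    "SELECT category, count(*) AS cnt "
    "FROM products "
    "WHERE in_stock = 'true' "
    "GROUP BY category "
    "ORDER BY cnt DESC "
    "LIMIT 10"
)
params = {
    "stmt": stmt,
    # map_reduce shuffles tuples across workers; "facet" pushes the
    # aggregation into Solr's faceting engine for low-cardinality fields.
    "aggregationMode": "map_reduce",
}
```

The `aggregationMode` choice is exactly the kind of decision the optimizer discussed in the talk can make for you.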
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
This talk describes how you can practically apply some of Lucene 4's new features (such as flexible indexing, scoring improvements, column-stride fields) to improve your search application.
The talk gives a brief description of these new features and example use cases you can try yourself with Lucene 4. We'll cover how you can configure Solr to:
- Set up the schema to use the Pulsing or Memory codec for a primary-key field
- Skip a separate spellcheck index, controlling character-level swaps from the query processor
- Sort with a different locale
- Use per-field similarity configurations, such as a non-vector-space algorithm
Apache Solr/Lucene Internals by Anatoliy Sokolenko (Provectus)
This document provides an overview of Apache Lucene and Solr. It discusses Lucene's data model, index structure, basic indexing and search flows. It also summarizes how Solr builds on Lucene to provide enterprise-level search capabilities with features like sharding, replication, and faceting. The document also covers text analysis in Lucene, spell checking, and references for further reading.
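As a refresher on the data model such internals talks cover: the core structure is the inverted index, which maps each term to the set of documents containing it. The following pure-Python sketch is illustrative only, not Lucene's actual implementation (real analyzers also stem, lowercase per locale, remove stop words, and store positions):

```python
from collections import defaultdict

def tokenize(text):
    # Stand-in for a Lucene analyzer: just lowercase and split on whitespace.
    return text.lower().split()

class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of doc ids

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: intersect the postings of every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = self.postings[terms[0]].copy()
        for term in terms[1:]:
            result &= self.postings[term]
        return result

idx = InvertedIndex()
idx.add(1, "Lucene is a search library")
idx.add(2, "Solr builds on Lucene")
print(idx.search("lucene search"))  # → {1}
```

Sharding, as Solr does it, is then just keeping one such index per shard and merging result sets at query time.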
This document provides an overview of a workshop on Lucene performance given by Lucid Imagination, Inc. It discusses common Lucene performance issues, introduces Lucid Gaze for Lucene (LG4L) as a tool for monitoring Lucene performance statistics and examples of using it to analyze indexing and search performance. LG4L provides statistics on indexing, analysis, searching and storage through logs, a persistent database and an API. It can help identify causes of poor performance and was shown to have low overhead.
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ... (Lucidworks)
1) The document discusses using black box optimization algorithms to automate the tuning of a search engine's configuration parameters to improve search relevancy.
2) It describes using a test collection of queries and relevance judgments, or search logs, to evaluate how changes to parameters impact relevancy metrics. An optimization algorithm would intelligently search the parameter space.
3) Care must be taken to validate any improved parameters on a separate test set to avoid overfitting and ensure gains generalize to new data. The approach holds promise for automating what can otherwise be a slow manual tuning process.
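The loop described above can be sketched with simple random search, one of the black-box methods the talk alludes to. Everything here is illustrative: the parameter names and the quadratic stand-in for a relevancy metric are made up; in practice `evaluate` would compute something like NDCG over a judged query set.

```python
import random

def random_search(evaluate, param_space, n_trials=500, seed=42):
    """Black-box tuning: sample parameter sets, keep the best by metric."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi)
                  for name, (lo, hi) in param_space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy metric standing in for NDCG; peaks at title_boost=2.0, body_boost=0.5.
def train_metric(p):
    return -(p["title_boost"] - 2.0) ** 2 - (p["body_boost"] - 0.5) ** 2

space = {"title_boost": (0.0, 5.0), "body_boost": (0.0, 2.0)}
best, score = random_search(train_metric, space)
```

Per the talk's warning, `best` should then be re-scored on a held-out query set before being trusted, since the search itself can overfit the training judgments.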
LuceneRDD for (Geospatial) Search and Entity Linkage (zouzias)
In this talk, I will present the design and implementation of LuceneRDD for Apache Spark. LuceneRDD instantiates an inverted index on each Spark executor and collects / aggregates search results from Spark executors to the Spark driver. The main motivation behind LuceneRDD is to natively extend Spark's capabilities with full-text search, geospatial search and entity linkage without requiring an external dependency of a SolrCloud or Elasticsearch cluster.
As a case study, we will show how LuceneRDD can tackle the entity linkage problem. We will demonstrate both the flexibility and efficiency of LuceneRDD for this problem. First, we will show that LuceneRDD's interface provides users a highly flexible approach to entity linkage. This flexibility is due to Lucene's powerful query language, which can combine multiple full-text queries such as term, prefix, fuzzy and phrase queries. Second, we will focus on the efficiency and scalability of LuceneRDD by linking records between two relatively large datasets.
Lastly and time permitting, I will present ShapeLuceneRDD which enhances LuceneRDD with geospatial queries.
Organizations continue to adopt Solr because of its ability to scale to meet even the most demanding workflows. Recently, LucidWorks has been leading the effort to identify, measure, and expand the limits of Solr. As part of this effort, we've learned a few things along the way that should prove useful for any organization wanting to scale Solr. Attendees will come away with a better understanding of how sharding and replication impact performance. Also, no benchmark is useful without being repeatable; Tim will also cover how to perform similar tests using the Solr-Scale-Toolkit in Amazon EC2.
H-Hypermap - Heatmap Analytics at Scale: Presented by David Smiley, D W Smile... (Lucidworks)
This document provides an agenda and overview for a presentation on H-Hypermap, a project to build a search platform called the Billion Object Platform (BOP) to index and search over billions of geo-tagged tweets in near real-time. The presentation will cover the architecture using Apache Kafka, Solr sharding, and techniques for fast geo-spatial queries and heatmaps. It will also discuss experiences using technologies like Kotlin, Dropwizard, Docker and Kontena.
ElasticSearch in Production: lessons learned (BeyondTrees)
ElasticSearch is an open source search and analytics engine that allows for scalable full-text search, structured search, and analytics on textual data. The author discusses her experience using ElasticSearch at Udini to power search capabilities across millions of articles. She shares several lessons learned around indexing, querying, testing, and architecture considerations when using ElasticSearch at scale in production environments.
Faceted search is a powerful technique to let users easily navigate the search results. It can also be used to develop rich user interfaces, which give an analyst quick insights about the documents space. In this session I will introduce the Facets module, how to use it, under-the-hood details as well as optimizations and best practices. I will also describe advanced faceted search capabilities with Lucene Facets.
Feature Hashing for Scalable Machine Learning: Spark Summit East talk by Nick... (Spark Summit)
Feature hashing is a powerful technique for handling high-dimensional features in machine learning. It is fast, simple, memory-efficient, and well suited to online learning scenarios. While an approximation, it has surprisingly low accuracy tradeoffs in many machine learning problems.
Feature hashing has been made somewhat popular by libraries such as Vowpal Wabbit and scikit-learn. In Spark MLlib it is mostly used for text features; however, its use cases extend more broadly. Many Spark users are not familiar with the ways in which feature hashing might be applied to their problems.
In this talk, I will cover the basics of feature hashing, and how to use it for all feature types in machine learning. I will also introduce a more flexible and powerful feature hashing transformer for use within Spark ML pipelines. Finally, I will explore the performance and scalability tradeoffs of feature hashing on various datasets.
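The core trick is small enough to sketch in a few lines: hash each feature name into a fixed number of buckets instead of keeping a dictionary, and use one hash bit as a sign to reduce collision bias. This is an illustrative stand-in, not MLlib's implementation (which uses MurmurHash3); CRC32 is used here only because it is stable across runs.

```python
import zlib

def hashing_vectorizer(features, n_buckets=16):
    """The hashing trick: map arbitrary feature names into a fixed-size
    vector with no stored vocabulary. Memory is O(n_buckets) regardless
    of how many distinct feature names the stream contains."""
    vec = [0.0] * n_buckets
    for name, value in features.items():
        h = zlib.crc32(name.encode("utf-8"))      # stable hash of the name
        idx = h % n_buckets                        # bucket index
        sign = 1.0 if (h >> 16) & 1 == 0 else -1.0  # one hash bit picks the sign
        vec[idx] += sign * value
    return vec

v = hashing_vectorizer({"word=spark": 1.0, "word=solr": 2.0})
```

Because no dictionary is kept, previously unseen features at prediction time hash into the same fixed-size space, which is what makes the technique suit online learning.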
Integrating Spark and Solr - Timothy Potter, Lucidworks (Spark Summit)
This document discusses integrating Solr and Spark. It provides an example of using Solr as a sink for streaming data from Spark Streaming. It also describes reading data from Solr into Spark using SolrRDD and exposing it as a Spark SQL DataFrame. Additional capabilities covered include querying Solr from the Spark shell, document matching using stored queries, and reading term vectors from Solr for machine learning with MLLib.
Your Big Data Stack is Too Big!: Presented by Timothy Potter, Lucidworks
Timothy Potter presented at a Big Data conference in Boston from October 11-14, 2016. He discussed how Lucidworks Fusion provides an alternative to traditional big data stacks that emphasizes fast access, agility and automation over integration. Fusion allows for common access patterns like fast lookups, ranked retrieval and distributed scans while integrating technologies like Solr, Spark, HDFS and more. It provides tools for data ingestion, time-based partitioning, analytics, machine learning and more to solve business problems rather than focus on infrastructure.
Introduction to Lucene & Solr and Usecases (Rahul Jain)
Rahul Jain gave a presentation on Lucene and Solr. He began with an overview of information retrieval and the inverted index. He then discussed Lucene, describing it as an open source information retrieval library for indexing and searching. He discussed Solr, describing it as an enterprise search platform built on Lucene that provides distributed indexing, replication, and load balancing. He provided examples of how Solr is used for search, analytics, auto-suggest, and more by companies like eBay, Netflix, and Twitter.
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
At Twitter we serve more than 1.5 billion queries per day from Lucene indexes, while appending more than 200 million tweets per day in realtime. Additionally we recently launched image, video and relevance search on the same engine.
This talk will explain the changes we made to Lucene to support this high load and the changes and improvements we made in the last year.
Search Analytics Component: Presented by Steven Bower, Bloomberg L.P. (Lucidworks)
This document describes Bloomberg's development of a search analytics component for Solr. It was created by their search team to enable complex calculations and aggregations on numerical time-series data. Key features include statistical and mathematical expressions to facet and analyze data, supporting int, long, float, date and string fields. Examples show calculating a weighted average and variance. Future plans include multi-shard support and filtering result sets based on calculated statistics.
Solr 3.1 includes many new features and improvements such as range faceting on numeric fields, geospatial search enhancements, JSON document indexing, autosuggest and spellcheck components, analysis filter improvements, and distributed support for additional components. Major components include Apache Lucene 3.1.0, Apache Tika 0.8, Carrot2 3.4.2, Velocity 1.6.1 and Velocity Tools 2.0-beta3, and Apache UIMA 2.3.1-SNAPSHOT.
This presentation was given at the Lucene/Solr Revolution conference in Washington, DC in 2014 by Oleg Savrasov. It discusses why special faceting is needed for Block Join queries and proposes a Block Join facet component.
This Ain't Your Parent's Search Engine: Presented by Grant Ingersoll, Lucidworks
Search technology has evolved from traditional keyword search to include richer data modeling, dynamic faceting, aggregations, analytics, spatial search, record linkage, alerting, and solutions for top N problems. Lucene and Solr have been updated with reduced memory usage, pluggable formats and similarity, column-oriented storage, time/space integration, and advanced distributed capabilities including joins, grouping, and pivots. Lucidworks Fusion performs real-time decision making and routing to provide search and recommendations based on clicks, tweets, ratings, locations and other data.
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer... (Lucidworks)
This document discusses improving search precision through better phrase detection, such as recognizing noun phrases using autophrasing. It also describes implementing query autofiltering to map noun and verb phrases in queries to metadata fields, and providing a suggester component that leverages faceted metadata to provide contextual suggestions.
Lucene/Solr Spatial in 2015: Presented by David Smiley (Lucidworks)
The document summarizes new features, approaches, and improvements in Lucene and Solr spatial search capabilities in 2015. Key points include:
- New heatmap and grid faceting features allow spatial density visualization and are available in Lucene and Solr.
- Geo3D support in Lucene provides more accurate geometry representations and calculations on the surface of a sphere or ellipsoid.
- Indexing accuracy was improved through combining recursive prefix trees with serialized geometry storage.
- BKD tree indexes in Lucene provide faster point searching than prefix trees but currently only support filtering.
- The document outlines additional pending work and opportunities for the future.
SearchHub - How to Spend Your Summer Keeping it Real: Presented by Grant Inge... (Lucidworks)
This document provides an overview and agenda for a SearchHub presentation on using Lucidworks Fusion to build a search application called SearchHub. The presentation will cover configuring Fusion and the SearchHub UI, acquiring data from sources like Apache mailing lists and GitHub, deploying SearchHub on AWS, using signals and machine learning in Fusion, and next steps for the project. It includes demos of features like recommendations, tokenization, topic detection, and experiment management in Fusion.
Search Architecture at Evernote: Presented by Christian Kohlschütter, Evernote (Lucidworks)
Evernote stores over 3 billion notes from over 100 million users worldwide. To improve search performance and allow upgrades to newer Lucene versions, Evernote rearchitected their search system. They separated search code from the data storage, allowed multiple Lucene versions to run concurrently on each machine, and automatically migrated each user's index to the default version without downtime. This reduced disk I/O by 81% and allowed compression techniques to further reduce storage needs by terabytes and input/output by petabytes each week.
This document discusses Elasticsearch and its uses. It outlines 6 common use cases for Elasticsearch: 1) site search, 2) related posts, 3) replacing WP_Query, 4) log analytics with Logstash, 5) content reranking, and 6) breaking the blog boundary. It also provides an overview of what Elasticsearch is, including that it is a search engine, distributed, scalable, and supports analytics and multiple languages.
Evolving Search Relevancy: Presented by James Strassburg, Direct Supply (Lucidworks)
The document discusses using genetic algorithms to optimize search engine relevancy by evolving search parameters. It describes encoding search parameters as candidate solutions, defining a fitness function to evaluate solutions based on metrics like NDCG, and using genetic operators like crossover and mutation to generate new parameter sets and optimize relevancy over time.
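The encoding/fitness/crossover/mutation cycle described above can be sketched in a few dozen lines. This is a generic toy, not Direct Supply's implementation: the two-parameter candidate and the quadratic fitness stand in for a real parameter vector scored by NDCG against relevance judgments.

```python
import random

def evolve(fitness, n_params, pop_size=30, generations=40, seed=7):
    """Minimal genetic algorithm: select the fitter half, then fill the
    population back up with crossed-over, mutated children."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0, 10) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]             # selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)
            child = a[:cut] + b[cut:]                 # one-point crossover
            i = rng.randrange(n_params)
            child[i] += rng.gauss(0, 0.5)             # gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Stand-in fitness; in the talk this would be NDCG over judged queries.
target = [3.0, 7.0]
def fit(p):
    return -sum((x - t) ** 2 for x, t in zip(p, target))

best = evolve(fit, 2)
```

Keeping the parents unchanged each generation (elitism) guarantees the best candidate found so far is never lost, which matters when each fitness evaluation is an expensive batch of search queries.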
MongoDB: Queries and Aggregation Framework with NBA Game Data (Valeri Karpov)
This document provides an overview of querying and aggregating a dataset containing NBA box scores for over 30,000 games since 1985 using MongoDB. It demonstrates various MongoDB query and aggregation operations including findOne(), find(), count(), distinct(), $elemMatch, $sort, $limit, $unwind and aggregation pipelines to analyze and extract insights from the data such as answering questions about individual player and team statistics and performance.
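A typical pipeline from such an analysis chains `$unwind`, `$group`, `$sort`, and `$limit`. The field names below (`players`, `pts`) and the toy data are illustrative, not the talk's exact schema; to keep this runnable without a server, the same logic is evaluated in plain Python next to the pipeline definition.

```python
# Question: which player has the highest average points per game?
# With a live server: db.box_scores.aggregate(pipeline)
pipeline = [
    {"$unwind": "$players"},
    {"$group": {"_id": "$players.name", "avg_pts": {"$avg": "$players.pts"}}},
    {"$sort": {"avg_pts": -1}},
    {"$limit": 1},
]

games = [
    {"players": [{"name": "Jordan", "pts": 40}, {"name": "Pippen", "pts": 20}]},
    {"players": [{"name": "Jordan", "pts": 50}, {"name": "Pippen", "pts": 10}]},
]
rows = [p for g in games for p in g["players"]]            # $unwind
totals = {}
for p in rows:                                             # $group with $avg
    s, n = totals.get(p["name"], (0, 0))
    totals[p["name"]] = (s + p["pts"], n + 1)
ranked = sorted(((name, s / n) for name, (s, n) in totals.items()),
                key=lambda r: r[1], reverse=True)          # $sort
top = ranked[:1]                                           # $limit
print(top)  # → [('Jordan', 45.0)]
```

Each pipeline stage maps to one transformation of the document stream, which is why the pure-Python version lines up stage for stage.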
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
Faceted Searching is a must-have feature for enhancing findability and user engagement in enterprise search UIs. The Faceted Searching features of Apache Solr have been a major factor in its popularity, but many Solr users don't fully appreciate all of the capabilities that are available. In this session we will deep dive into the different types of data facets that Solr supports, discussing in detail the various options that can be used to explore them. We will also review some specific techniques for dealing with several complex use cases, and discuss some performance "gotchas" and how to avoid them.
Webinar: Ecommerce, Rules, and Relevance (Lucidworks)
This document provides an agenda and overview for a webinar on using Lucidworks Rules Editor for e-commerce search relevancy. The webinar will introduce Rules Editor, demonstrate how to create different rule types and triggers to boost or block product listings, and discuss how rules are processed through Fusion query pipelines. It will also include a live demonstration of creating rules in the Best Buy catalog and take questions at the end.
Parallel Computing with SolrCloud: Presented by Joel Bernstein, Alfresco (Lucidworks)
This document summarizes Joel Bernstein's presentation on parallel SQL in Solr 6.0. The key points are:
1. SQL provides an optimizer to choose the best query plan for complex queries in Solr, avoiding the need for users to determine optimal faceting APIs or parameters.
2. SQL queries in Solr 6.0 can perform distributed joins, aggregations, sorting, and filtering using Solr search predicates. Aggregations can be performed using either map-reduce or facets.
3. Under the hood, SQL queries are compiled to TupleStreams which are serialized to Streaming Expressions and executed in parallel across worker collections using Solr's streaming API framework.
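For a feel of what those Streaming Expressions look like, here is the rough shape of the expression a grouped-count SQL query might compile to. The collection and field names are made up; the expression is shown as a Python string rather than sent to a cluster.

```python
# A GROUP BY/count(*) over "category" expressed as a streaming expression.
# search() streams sorted tuples from the /export handler; rollup() does a
# streaming aggregation over the sort key, so it needs no in-memory hash.
expr = (
    'rollup('
    'search(products, q="in_stock:true", fl="category", '
    'sort="category asc", qt="/export"), '
    'over="category", count(*))'
)
```

The requirement that the inner stream be sorted on the `over` field is what lets `rollup` aggregate in constant memory, and it is also why the SQL layer pushes sorting into Solr, as the summary notes.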
Autocomplete Multi-Language Search Using Ngram and EDismax Phrase Queries: Pr... (Lucidworks)
Ivan Provalov presented on autocomplete multi-language search using n-grams and EDismax phrase queries at Netflix. Key points included using n-grams and phrase queries to rank shorter documents higher for autocomplete, and addressing language challenges such as different scripts, character composition, and stopword handling. Provalov discussed using a character mapper filter to preprocess input, and developed an open source query testing framework with over 20,000 queries to test language queries and detect regressions.
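The n-gram side of this approach is easy to sketch: at index time each title is expanded into edge n-grams (prefixes), so a partial query like "str" becomes an exact term match. This is an illustrative toy, not Netflix's analyzer chain; real setups generate grams per token and layer phrase queries and language-specific filters on top.

```python
def edge_ngrams(term, min_len=1, max_len=10):
    """Index-time edge n-grams: the prefixes of a term."""
    return [term[:i] for i in range(min_len, min(len(term), max_len) + 1)]

# Build a tiny autocomplete index over whitespace-stripped titles.
index = {}
for title in ["stranger things", "star trek"]:
    for gram in edge_ngrams(title.replace(" ", "")):
        index.setdefault(gram, set()).add(title)

# The raw user input matches a stored gram directly: "str" finds only
# "stranger things", while "sta" finds only "star trek".
print(sorted(index.get("str", set())))
```

Doing the expansion at index time is what keeps query latency low enough for keystroke-by-keystroke suggestions.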
Blocks are a powerful concept, much needed for performance improvements and responsiveness. GCD runs blocks effortlessly, scheduling them on a desired queue with a chosen priority, and more.
Swift is Apple's language for the future, and in this presentation we'll cover a brief history of the Swift language, what advantages Swift has for today's microprocessors, and where it is going in the future.
Storm is a scalable distributed real-time computation system. It provides a simple programming model through topologies containing spouts that emit streams and bolts that process streams. Storm guarantees processing of all messages through anchoring and tracking tuples in distributed worker processes. It offers fault tolerance through mechanisms like acking tuples and replaying failed tasks. Exactly-once processing can be achieved through techniques like transaction IDs.
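The ack/replay guarantee can be sketched with a toy spout: emitted tuples stay pending until a bolt acks them, and failed tuples are re-queued. This is a single-process illustration of the idea, not Storm's actual tuple-tree tracking (which hashes anchored tuple IDs in acker tasks across the cluster).

```python
import uuid

class ReliableSpout:
    """Toy at-least-once delivery: a tuple is pending from emit until ack;
    a fail() puts it back on the queue for replay."""
    def __init__(self, messages):
        self.queue = list(messages)
        self.pending = {}  # tuple id -> message

    def next_tuple(self):
        if not self.queue:
            return None
        msg = self.queue.pop(0)
        tid = str(uuid.uuid4())
        self.pending[tid] = msg
        return tid, msg

    def ack(self, tid):
        self.pending.pop(tid, None)

    def fail(self, tid):
        self.queue.append(self.pending.pop(tid))  # replay later

spout = ReliableSpout(["a", "b"])
results = []
tid, msg = spout.next_tuple()
spout.fail(tid)                      # the bolt reported failure: "a" is re-queued
while (t := spout.next_tuple()) is not None:
    tid, msg = t
    results.append(msg)              # bolt processes the tuple...
    spout.ack(tid)                   # ...then acks it
```

At-least-once means "a" can be processed more than once after a replay, which is exactly why exactly-once semantics need the extra transaction-ID machinery the summary mentions.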
The JVM memory model describes how threads in the Java ecosystem interact through memory. While the memory model's impact on developing for the JVM may not be obvious, it is the cause of a certain number of "anomalies" that are, well, by design.
In this presentation we will explore the aspects of the memory model, including things like reordering of instructions, volatile members, monitors, atomics and JIT.
Using Groovy? Got lots of stuff to do at the same time? Then you need to take a look at GPars (“Jeepers!”), a library providing support for concurrency and parallelism in Groovy. GPars brings powerful concurrency models from other languages to Groovy and makes them easy to use with custom DSLs:
- Actors (Erlang and Scala)
- Dataflow (Io)
- Fork/join (Java)
- Agent (Clojure agents)
In addition to this support, GPars integrates with standard Groovy frameworks like Grails and Griffon.
Background, comparisons to other languages, and motivating examples will be given for the major GPars features.
Igor Fesenko, "Direction of C# as a High-Performance Language" (Fwdays)
There are a lot of upcoming performance changes in .NET, from code generation (JIT, AOT) to optimizations performed by the compiler (inlining, flowgraph and loop analysis, dead code elimination, SIMD, stack allocation, and so on). In this talk we will cover some features of C# 7 that move toward enabling low-level optimization.
I will share not only how we can improve performance with the next version of .NET, but how we can do it today using different techniques and tools like Roslyn analyzers, Channels (Push based Streams), System.Slices, System.Buffers and System.Runtime.CompilerServices.Unsafe.
This document summarizes a lecture on key-value storage systems. It introduces the key-value data model and compares it to relational databases. It then describes Cassandra, a popular open-source key-value store, including how it maps keys to servers, replicates data across multiple servers, and performs reads and writes in a distributed manner while maintaining consistency. The document also discusses Cassandra's use of gossip protocols to manage cluster membership.
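The key-to-server mapping the summary mentions is consistent hashing: servers own positions on a hash ring, and a key is stored on the next N positions clockwise (its replicas). This sketch illustrates the idea in miniature; Cassandra's real partitioners, virtual nodes, and replication strategies are more elaborate, and the node names here are made up.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: each node owns one position; a key's
    replicas are the next `replicas` distinct nodes clockwise."""
    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self.ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def nodes_for(self, key):
        start = bisect.bisect(self.ring, (self._hash(key), ""))
        owners = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == self.replicas:
                break
        return owners

ring = HashRing(["cass-1", "cass-2", "cass-3"])
owners = ring.nodes_for("user:42")
```

Because only the keys adjacent to a node's ring position move when that node joins or leaves, membership changes (propagated by gossip, per the summary) stay cheap.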
The document provides an overview of Ehcache 3, a caching framework that is a major upgrade from Ehcache 2. It highlights Ehcache 3's type-safe APIs, fluent configuration, support for multiple storage tiers including heap, off-heap and disk, and fully compliant support for the JSR-107 caching standard. It also discusses some of Ehcache 3's new features like expiry, eviction advisors, and cache loaders/writers, as well as features that were dropped like search and explicit locking. The presentation aims to explain Ehcache 3's motivation and significant changes from Ehcache 2.
Tomas Doran presented on their implementation of Logstash at TIM Group to process over 55 million messages per day. Their applications are all Java/Scala/Clojure and they developed their own library to send structured log events as JSON to Logstash using ZeroMQ for reliability. They index data in Elasticsearch and use it for metrics, alerts and dashboards but face challenges with data growth.
Scaling ingest pipelines with high performance computing principles - Rajiv K... (SignalFx)
By Rajiv Kurian, software engineer at SignalFx.
At SignalFx, we deal with high-volume high-resolution data from our users. This requires a high performance ingest pipeline. Over time we’ve found that we needed to adapt architectural principles from specialized fields such as HPC to get beyond performance plateaus encountered with more generic approaches. Some key examples include:
* Write very simple single threaded code, instead of complex algorithms
* Parallelize by running multiple copies of simple single threaded code, instead of using concurrent algorithms
* Separate the data plane from the control plane, instead of slowing data for control
* Write compact, array-based data structures with minimal indirection, instead of pointer-based data structures and uncontrolled allocation
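The second principle above can be sketched as sharding the stream by key and giving each shard to its own single-threaded worker with private state, so no locks or concurrent data structures are needed. This is an illustrative toy using threads and queues; a production pipeline would pin separate processes to cores, but the structure is the same.

```python
import queue
import threading
import zlib

N_WORKERS = 4

def worker(inbox, counts):
    # Plain single-threaded code over private state: no locks needed,
    # because no state is ever shared between shards.
    while True:
        item = inbox.get()
        if item is None:          # poison pill: drain and exit
            break
        counts[item] = counts.get(item, 0) + 1

inboxes = [queue.Queue() for _ in range(N_WORKERS)]
states = [{} for _ in range(N_WORKERS)]
threads = [threading.Thread(target=worker, args=(inboxes[i], states[i]))
           for i in range(N_WORKERS)]
for t in threads:
    t.start()

# Shard by key so the same metric name always lands on the same worker.
for metric in ["cpu", "mem", "cpu", "disk", "cpu", "mem"]:
    shard = zlib.crc32(metric.encode()) % N_WORKERS
    inboxes[shard].put(metric)

for q in inboxes:
    q.put(None)
for t in threads:
    t.join()

merged = {k: v for s in states for k, v in s.items()}
```

The poison pill doubles as the control plane here: control messages travel through the same ordered queue as data, so shutdown cannot overtake unprocessed items.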
The document discusses a technique called FRECKLE that provides lock-free access and modification of standard STL containers from multiple threads. FRECKLE uses a single shared shared_ptr to the container that is accessed atomically. Readers can concurrently access the container via atomic_load while writers copy, modify, and replace the container contents using atomic_compare_exchange. This allows concurrent reads and isolated writes without locks in a way that guarantees exception safety.
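The copy-modify-publish pattern translates to other runtimes too. Below is a hedged Python analogue of the idea, not FRECKLE itself: readers take a snapshot reference without locking, while writers copy, modify, and publish a new container. CPython's reference rebinding is atomic, so readers never observe a partial update; a lock serializes only writers, where the C++ version uses `atomic_compare_exchange`.

```python
import threading

class COWList:
    """Copy-on-write list: lock-free snapshot reads, serialized writers."""
    def __init__(self):
        self._data = []
        self._write_lock = threading.Lock()

    def snapshot(self):
        # Readers: no lock. Treat the returned list as immutable.
        return self._data

    def append(self, item):
        # Writers: copy, modify, then publish the new list in one rebind.
        with self._write_lock:
            new = list(self._data)
            new.append(item)
            self._data = new

c = COWList()
snap = c.snapshot()   # old snapshot stays valid and unchanged
c.append(1)
c.append(2)
```

As in the C++ technique, a reader holding `snap` keeps a consistent view even while writers replace the container underneath, which is what gives the exception-safety and isolation guarantees the summary describes.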
LoLA is a tool for verifying properties of Petri nets. This document discusses how to:
1. Choose and manage LoLA configurations to optimally verify properties.
2. Ask the right verification questions in a specific, modular way to efficiently verify properties.
3. Optimize Petri net modeling to take advantage of LoLA's reduction techniques and scale verification.
4. Employ scripts and makefiles to automate calling LoLA and analyzing results.
5. Integrate calling LoLA from other tools using UNIX streams for modular verification.
Kudu is an open source storage layer developed by Cloudera that provides low latency queries on large datasets. It uses a columnar storage format for fast scans and an embedded B-tree index for fast random access. Kudu tables are partitioned into tablets that are distributed and replicated across a cluster. The Raft consensus algorithm ensures consistency during replication. Kudu is suitable for applications requiring real-time analytics on streaming data and time-series queries across large datasets.
Bringing Concurrency to Ruby - RubyConf India 2014 (Charles Nutter)
The document discusses bringing concurrency to Ruby. It begins by defining concurrency and parallelism, noting that both are needed but platforms only enable parallelism if jobs can split into concurrent tasks. It reviews concurrency and parallelism in popular Ruby platforms like MRI, JRuby, and Rubinius. The document outlines four rules for concurrency and discusses techniques like immutable data, locking, atomics, and specialized collections for mutable data. It highlights libraries that provide high-level concurrency abstractions like Celluloid for actors and Sidekiq for background jobs.
.NET UY Meetup 7 - CLR Memory by Fabian Alves (.NET UY Meetup)
The document discusses key concepts related to memory management in the .NET CLR, including the heap and stack, value and reference types, pointers, and how objects are allocated in memory. It explains the garbage collection process, including different flavors, generations of objects, and pinning. Large object heap and finalization are also covered as it relates to unmanaged resources. Overall, the document provides a comprehensive overview of memory management in the .NET CLR.
This document discusses using Ruby for distributed storage systems. It describes components like Bigdam, which is Treasure Data's new data ingestion pipeline. Bigdam uses microservices and a distributed key-value store called Bigdam-pool to buffer data. The document discusses designing and testing Bigdam using mocking, interfaces, and integration tests in Ruby. It also explores porting Bigdam-pool from Java to Ruby and investigating Ruby's suitability for tasks like asynchronous I/O, threading, and serialization/deserialization.
This document discusses run-time addressing and storage of variables in programming. It covers how variables are accessed using offsets from frames or stacks. It also discusses variable-length local data and how it can be allocated dynamically on the stack or heap. The document then covers scope, static and dynamic scoping rules, and how static links are used to access non-local variables at run-time.
Presentation about the Spil Storage Platform (SSP) written in Erlang. This talk was first given at the Erlang User Group Netherlands in July 2012 hosted at Spilgames in Hilversum.
These are the slides from my talk at the 2012 Sphinx Search Day in Santa Clara, California. It provides a high-level picture of where Sphinx is used at craigslist, a bit of history, issues, and future work.
Similar to Search at Twitter: Presented by Michael Busch, Twitter
Search is the Tip of the Spear for Your B2B eCommerce Strategy (Lucidworks)
With ecommerce experiencing explosive growth, it seems intuitive that the B2B segment of that ecosystem is mirroring the same trajectory. That said, B2B has very different needs when it comes to transacting with the same style of experiences that we see in B2C. For instance, B2B ecommerce is about precision findability, whereas B2C customers can convert at higher rates when they’re just browsing online. In order for the B2B buying experience to be successful, search needs to be tuned to meet the unique needs of the segment.
In this webinar with Forrester senior analyst Joe Cicman, you’ll learn:
-Which verticals in B2B will drive the most growth, and how machine-learning powered personalization tactics can be deployed to support those specific verticals
-Why an omnichannel selling approach must be deployed in order to see success in B2B
-How deploying content search capabilities will support a longer sales cycle at scale
-What the next steps are to support a robust B2B commerce strategy supported by new technology
Speakers
Joe Cicman, Senior Analyst, Forrester
Jenny Gomez, VP of Marketing, Lucidworks
Customer loyalty starts with quickly responding to your customer’s needs. When it comes to resolving open support cases, time is of the essence. Time spent searching for answers adds up and creates inefficiencies in resolving cases at scale. Relevant answers need to be a few clicks away and easily accessible for agents directly from their service console.
We will explore how Lucidworks’ Agent Insights application automatically connects agents with the correct answers and resources. You’ll learn how to:
-Configure a proactive widget in an agent’s case view page to access resources across third-party systems (such as Sharepoint, Confluence, JIRA, Zendesk, and ServiceNow).
-Easily set up query pipelines to autonomously route assets and resources that are relevant to the case-at-hand—directly to the right agent.
-Identify subject matter experts within your support data and access tribal knowledge with lightning-fast speed.
How Crate & Barrel Connects Shoppers with Relevant ProductsLucidworks
Lunch and Learn during Retail TouchPoints #RIC21 virtual event.
***
Crate & Barrel’s previous search solution couldn’t provide its shoppers with an online search and browse experience consistent with the customer-centric Crate & Barrel brand. Meanwhile, Crate & Barrel merchandisers spent the bulk of their time manually creating and maintaining search rules. The search experience impacted customer retention, loyalty, and revenue growth.
Join this lunch & learn for an interactive chat on how Crate & Barrel partnered with Lucidworks to:
-Improve search and browse by modernizing the technology stack with ML-based personalization and merchandising solutions
-Enhance the experience for both shoppers and merchandisers
-Explore signals to transform the omnichannel shopping experience
Questions? Visit https://lucidworks.com/contact/
Learn how to guide customers to relevant products using eCommerce search, hyper-personalisation, and recommendations in our ‘Best-In-Class Retail Product Discovery’ webinar.
Nowadays, shoppers want their online experience to be engaging, inspirational and fulfilling. They want to find what they’re looking for quickly and easily. If the sought after item isn’t available, they want the next best product or content surfaced to them. They want a website to understand their goals as though they were talking to a sales assistant in person, in-store.
In this webinar, we explore IMRG industry data insights and a best-in-class example of retail product discovery. You’ll learn:
- How AI can drive increased revenue through hyper-personalised experiences
- How user intent can be easily understood and results displayed immediately
- How merchandisers can be empowered to curate results and product placement – all without having to rely on IT.
Presented by:
Dave Hawkins, Principal Sales Engineer - Lucidworks
Matthew Walsh, Director of Data & Retail - IMRG
Connected Experiences Are Personalized ExperiencesLucidworks
Many companies claim personalization and omnichannel capabilities are top priorities. Few are able to deliver on those experiences.
For a recent Lucidworks-commissioned study, Forrester Consulting surveyed 350+ global business decision-makers to see what gets in the way of achieving these goals. They discovered that inefficient technology, lack of behavioral insights, and failure to tie initiatives to enterprise-wide goals are some of the most frequent blockers to personalization success.
Join guest speaker, Forrester VP and Principal Analyst, Brendan Witcher, and Lucidworks CEO, Will Hayes, to hear the results of the Forrester Consulting study, how to avoid “digital blindness,” and how to apply VoC data in real-time to delight customers with personalized experiences connected across every touchpoint.
In this webinar, you’ll learn:
- Why companies who utilize real-time customer signals report more effective personalization
- How to connect employees and customers in a shared experience through search and browse
- How Lucidworks clients Lenovo, Morgan Stanley and Red Hat fast-tracked improvements in conversion, engagement and customer satisfaction
Featuring
- Will Hayes, CEO, Lucidworks
- Brendan Witcher, VP, Principal Analyst, Forrester
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Lucidworks
Intelligent Policing. Leveraging Data to more effectively Serve Communities.
Policing in the next decade is anticipated to be very different from historical methods. More data driven, more focused on the intricacies of communities they serve and more open and collaborative to make informed recommendations a reality. Whether its social populations, NIBRS or organization improvement that’s the driver, the IT requirement is largely the same. Provide 360 access to large volumes of siloed data to gain a full 360 understanding of existing connections and patterns for improved insight and recommendation.
Join us for a round table discussion of how the Toronto Police Service is better serving their community through deploying a unified intelligent data platform.
Data innovation improves officers' engagement with existing data and streamlines investigation workflows by enhancing collaboration. This improved visibility into existing police data allows for a more intelligent and responsive police force.
In this webinar, we'll cover:
-The technology needs of an intelligent police force.
-How a Global Search improves an officer's interaction with existing data.
Featuring:
-Simon Taylor, VP, Worldwide Channels & Alliances, Lucidworks
-Michael Cizmar, Managing Director, MC+A
-Ian Williams, Manager of Analytics & Innovation, Toronto Police Service
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...Lucidworks
Policing in the next decade is anticipated to be very different from historical methods. More data driven, more focused on the intricacies of communities they serve and more open and collaborative to make informed recommendations a reality. Whether its social populations, NIBRS or organization improvement that’s the driver, the IT requirement is largely the same. Provide 360 access to large volumes of siloed data to gain a full 360 understanding of existing connections and patterns for improved insight and recommendation.
Join us for a round table discussion of how the Toronto Police Service is better serving their community through deploying a unified intelligent data platform.
Data innovation improves officers' engagement with existing data and streamlines investigation workflows by enhancing collaboration. This improved visibility into existing police data allows for a more intelligent and responsive police force.
In this webinar, we'll cover:
The technology needs of an intelligent police force.
How a Global Search improves an officer's interaction with existing data.
Featuring
-Simon Taylor, VP, Worldwide Channels & Alliances, Lucidworks
-Michael Cizmar, Managing Director, MC+A
-Ian Williams, Manager of Analytics & Innovation, Toronto Police Service
Preparing for Peak in Ecommerce | eTail Asia 2020Lucidworks
This document provides a framework for prioritizing onsite search problems and key performance indicators (KPIs) to measure for e-commerce search optimization. It recommends prioritizing fixing searches that yield no results, improving relevance of results, and reducing false positives. The most essential KPIs to measure include query latency, throughput, result relevance through click-through rates and NDCG scores. The document also provides tips for self-benchmarking search performance and examples of search performance benchmarks across nine e-commerce sites from various industries.
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
Wish your conversion rates were higher? Can’t figure out how to efficiently and effectively serve all the visitors on your site? Embarrassed by the quality of your product discovery experience? The bar is high and the influx of online shopping over recent months has reminded us that the opportunities are real. We’re all deep in holiday prep, but let’s take a few minutes to think about January 2021 and beyond. How can we position ourselves for success with our customers and against our competition?
Grab your lunch and let’s dive into three strategies that need to be part of your 2021 roadmap. You don’t need an army to get there. But you do need to take action and capitalize on the shoppers abandoning the product discovery journey on your site.
In this session, attendees will find out how to:
-Take control of merchandising at scale;
-Implement hands-free search relevancy; and
-Address personalization challenges.
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
For a personalized search experience, search curation requires robust text interpretation, data enrichment, relevancy tuning and recommendations. In order to achieve this, language and entity identification are crucial.
For teams working on search applications, advanced language packages allow them to achieve greater recall without sacrificing precision.
Join us for a guided tour of our new Advanced Linguistics packages, available in Fusion, thanks to the technology partnership between Lucidworks and Basistech.
We’ll explore the application of language identification and entity extraction in the context of search, along with practical examples of personalizing search and enhancing entity extraction.
In this webinar, we’ll cover:
-How Fusion uses the Rosette Basic Linguistics and Entity Extraction packages
-Tips for improving language identification and treatment as well as data enrichment for personalization
-Speech2 demo modeling Active Recommendation
-Use Rosette’s packages with Fusion Pipelines to build custom entities for specific domain use cases
Featuring:
-Radu Miclaus, Director of Product, AI and Cloud, Lucidworks, Lucidworks
-Robert Lucarini, Senior Software Engineer, Lucidworks
-Nick Belanger, Solutions Engineer, Basis Technology
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
Before COVID-19, almost 80% of the US workforce worked service in jobs that involve in-person interaction with strangers. Now, leaders of service organizations must reshape their offerings during the pandemic and prepare for whatever the new normal turns out to be. Our three panelists will share ideas for adapting their service businesses, now that closer-than-six-feet isn’t an option.
Join Lucidworks as we talk shop with 3 service business leaders, covering:
-Common impacts of the pandemic on service businesses (and what to do about them),
-How service teams can maintain a human touch across virtual channels, and
-Plans for the future, before and after the pandemic subsides.
Featuring
-Sara Nathan, President & CEO, AMIGOS
-Anthony Carruesco, Founder, AC Fly Fishing
-sara bradley, chef and proprietor, freight house
-Justin Sears, VP Product Marketing, Lucidworks
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
The COVID-19 pandemic has forced companies to support far more customers and employees through digital channels than ever before. Many are turning to chatbots to help meet increasing demand, but traditional rules-based approaches can’t keep up. Our new Smart Answers add-on to Lucidworks Fusion makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
Watch our on-demand webinar showcasing Smart Answers on Lucidworks Fusion. This technology makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
In this webinar, we’ll cover off:
-How search and deep learning extend conversational frameworks for improved experiences
-How Smart Answers improves customer care, call deflection, and employee self-service
-A live demo of Smart Answers for multi-channel self-service support
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
In the current climate, it’s now more important than ever to digitally enable your workforce and customers.
Hear from Simon Taylor, VP Global Partners & Alliances, Lucidworks and Matt Aslett, Research Vice President, 451 Research to get the inside scoop on how industry leaders in Europe are developing and executing their digital transformation strategies.
In this webinar, we’ll discuss:
The top challenges and aspirations European business and technology leaders are solving using AI and search technology
Which search and AI use cases are making the biggest impact in industries such as finance, healthcare, retail and energy in Europe
What technology buyers should look for when evaluating AI and search solutions
Webinar: Accelerate Data Science with Fusion 5.1Lucidworks
This document introduces Fusion 5.1 and its new capabilities for integrating with data science tools like Tensorflow, Scikit-Learn, and Spacy.
It provides an overview of Fusion's capabilities for understanding content, users, and delivering insights at scale. The document then demonstrates Fusion's Jupyter Notebook integration for reading and writing data and running SQL queries.
Finally, it shows how Fusion integrates with Seldon Core to easily deploy machine learning models with tools like Tensorflow and Scikit-Learn. A live demo is provided of deploying a custom model and using it in Fusion's query and indexing pipelines.
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
In this webinar with 451 Research, you'll understand how retailers are using AI to predict customer intent and learn which key performance metrics are used by more than 120 online retailers in Lucidworks’ 2019 Retail Benchmark Survey.
In this webinar, you’ll learn:
● What trends and opportunities are facing the ecommerce industry in 2020
● Why search is the universal path to understanding customer intent
● How large online retailers apply AI to maximize the effectiveness of their personalization efforts
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
Nordstrom Rack | Hautelook curates and serves customers a wide selection of on-trend apparel, accessories, and shoes at an everyday savings of up to 75 percent off regular prices. With over a million visitors shopping across different platforms every day, and a realization that customers have become accustomed to robust and personalized search interactions, Nordstrom Rack | Hautelook launched an initiative over a year ago to provide data science-driven digital experiences to their customers.
In this session, we’ll discuss Nordstrom Rack | Hautelook’s journey of operationalizing a hefty strategy, optimizing a fickle infrastructure, and rallying troops around a single vision of building an expansible machine-learning driven product discovery engine.
The audience will learn about:
-The key technical challenges and outcomes that come with onboarding a solution
-The lessons learned of creating and executing operational design
-The use of Lucidworks Fusion to plug custom data science models into search and browse applications to understand user intent and deliver personalized experiences
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
Knowledge graphs and machine learning are on the rise as enterprises hunt for more effective ways to connect the dots between the data and the business world. With newer technologies, the digital workplace can dramatically improve employee engagement, data-driven decisions, and actions that serve tangible business objectives.
In this webinar, you will learn
-- Introduction to knowledge graphs and where they fit in the ML landscape
-- How breakthroughs in search affect your business
-- The key features to consider when choosing a data discovery platform
-- Best practices for adopting AI-powered search, with real-world examples
Webinar: Building a Business Case for Enterprise SearchLucidworks
The document discusses building a business case for enterprise search. It notes that 85% of information is unstructured data locked in various locations and applications. Many knowledge workers spend a significant portion of their day searching across multiple systems for information. The rise of unstructured data and AI capabilities can help organizations unlock value from their information assets. Effective enterprise search powered by AI can provide real-time intelligence, personalized information, and more efficient research to help knowledge workers.
How UiPath Discovery Suite supports identification of Agentic Process Automat...DianaGray10
📚 Understand the basics of the newly persona-based LLM-powered Agentic Process Automation and discover how existing UiPath Discovery Suite products like Communication Mining, Process Mining, and Task Mining can be leveraged to identify APA candidates.
Topics Covered:
💡 Idea Behind APA: Explore the innovative concept of Agentic Process Automation and its significance in modern workflows.
🔄 How APA is Different from RPA: Learn the key differences between Agentic Process Automation and Robotic Process Automation.
🚀 Discover the Advantages of APA: Uncover the unique benefits of implementing APA in your organization.
🔍 Identifying APA Candidates with UiPath Discovery Products: See how UiPath's Communication Mining, Process Mining, and Task Mining tools can help pinpoint potential APA candidates.
🔮 Discussion on Expected Future Impacts: Engage in a discussion on the potential future impacts of APA on various industries and business processes.
Enhance your knowledge on the forefront of automation technology and stay ahead with Agentic Process Automation. 🧠💼✨
Speakers:
Arun Kumar Asokan, Delivery Director (US) @ qBotica and UiPath MVP
Naveen Chatlapalli, Solution Architect @ Ashling Partners and UiPath MVP
Generative AI technology is a fascinating field that focuses on creating comp...Nohoax Kanont
Generative AI technology is a fascinating field that focuses on creating computer models capable of generating new, original content. It leverages the power of large language models, neural networks, and machine learning to produce content that can mimic human creativity. This technology has seen a surge in innovation and adoption since the introduction of ChatGPT in 2022, leading to significant productivity benefits across various industries. With its ability to generate text, images, video, and audio, generative AI is transforming how we interact with technology and the types of tasks that can be automated.
Top 12 AI Technology Trends For 2024.pdfMarrie Morris
Technology has become an irreplaceable component of our daily lives. The role of AI in technology revolutionizes our lives for the betterment of the future. In this article, we will learn about the top 12 AI technology trends for 2024.
Discovery Series - Zero to Hero - Task Mining Session 1DianaGray10
This session is focused on providing you with an introduction to task mining. We will go over different types of task mining and provide you with a real-world demo on each type of task mining in detail.
TrustArc Webinar - Innovating with TRUSTe Responsible AI CertificationTrustArc
In a landmark year marked by significant AI advancements, it’s vital to prioritize transparency, accountability, and respect for privacy rights with your AI innovation.
Learn how to navigate the shifting AI landscape with our innovative solution TRUSTe Responsible AI Certification, the first AI certification designed for data protection and privacy. Crafted by a team with 10,000+ privacy certifications issued, this framework integrated industry standards and laws for responsible AI governance.
This webinar will review:
- How compliance can play a role in the development and deployment of AI systems
- How to model trust and transparency across products and services
- How to save time and work smarter in understanding regulatory obligations, including AI
- How to operationalize and deploy AI governance best practices in your organization
Welcome to Cyberbiosecurity. Because regular cybersecurity wasn't complicated...Snarky Security
How wonderful it is that in our modern age, every bit of our biological data can be digitized, stored, and potentially pilfered by cyber thieves! Isn't it just splendid to think that while scientists are busy pushing the boundaries of biotechnology, hackers could be plotting the next big bio-data heist? This delightful scenario is brought to you by the ever-expanding digital landscape of biology and biotechnology, where the integration of computer science, engineering, and data science transforms our understanding and manipulation of biological systems.
While the fusion of technology and biology offers immense benefits, it also necessitates a careful consideration of the ethical, security, and associated social implications. But let's be honest, in the grand scheme of things, what's a little risk compared to potential scientific achievements? After all, progress in biotechnology waits for no one, and we're just along for the ride in this thrilling, slightly terrifying, adventure.
So, as we continue to navigate this complex landscape, let's not forget the importance of robust data protection measures and collaborative international efforts to safeguard sensitive biological information. After all, what could possibly go wrong?
-------------------------
This document provides a comprehensive analysis of the security implications biological data use. The analysis explores various aspects of biological data security, including the vulnerabilities associated with data access, the potential for misuse by state and non-state actors, and the implications for national and transnational security. Key aspects considered include the impact of technological advancements on data security, the role of international policies in data governance, and the strategies for mitigating risks associated with unauthorized data access.
This view offers valuable insights for security professionals, policymakers, and industry leaders across various sectors, highlighting the importance of robust data protection measures and collaborative international efforts to safeguard sensitive biological information. The analysis serves as a crucial resource for understanding the complex dynamics at the intersection of biotechnology and security, providing actionable recommendations to enhance biosecurity in an digital and interconnected world.
The evolving landscape of biology and biotechnology, significantly influenced by advancements in computer science, engineering, and data science, is reshaping our understanding and manipulation of biological systems. The integration of these disciplines has led to the development of fields such as computational biology and synthetic biology, which utilize computational power and engineering principles to solve complex biological problems and innovate new biotechnological applications. This interdisciplinary approach has not only accelerated research and development but also introduced new capabilities such as gene editing and biomanufact
The History of Embeddings & Multimodal EmbeddingsZilliz
Frank Liu will walk through the history of embeddings and how we got to the cool embedding models used today. He'll end with a demo on how multimodal RAG is used.
Choosing the Best Outlook OST to PST Converter: Key Features and Considerationswebbyacad software
When looking for a good software utility to convert Outlook OST files to PST format, it is important to find one that is easy to use and has useful features. WebbyAcad OST to PST Converter Tool is a great choice because it is simple to use for anyone, whether you are tech-savvy or not. It can smoothly change your files to PST while keeping all your data safe and secure. Plus, it can handle large amounts of data and convert multiple files at once, which can save you a lot of time. It even comes with 24*7 technical support assistance and a free trial, so you can try it out before making a decision. Whether you need to recover, move, or back up your data, Webbyacad OST to PST Converter is a reliable option that gives you all the support you need to manage your Outlook data effectively.
Finetuning GenAI For Hacking and DefendingPriyanka Aash
Generative AI, particularly through the lens of large language models (LLMs), represents a transformative leap in artificial intelligence. With advancements that have fundamentally altered our approach to AI, understanding and leveraging these technologies is crucial for innovators and practitioners alike. This comprehensive exploration delves into the intricacies of GenAI, from its foundational principles and historical evolution to its practical applications in security and beyond.
"Making .NET Application Even Faster", Sergey Teplyakov.pptxFwdays
In this talk we're going to explore performance improvement lifecycle, starting with setting the performance goals, using profilers to figure out the bottle necks, making a fix and validating that the fix works by benchmarking it. The talk will be useful for novice and seasoned .NET developers and architects interested in making their application fast and understanding how things work under the hood.
UiPath Community Day Amsterdam: Code, Collaborate, ConnectUiPathCommunity
Welcome to our third live UiPath Community Day Amsterdam! Come join us for a half-day of networking and UiPath Platform deep-dives, for devs and non-devs alike, in the middle of summer ☀.
📕 Agenda:
12:30 Welcome Coffee/Light Lunch ☕
13:00 Event opening speech
Ebert Knol, Managing Partner, Tacstone Technology
Jonathan Smith, UiPath MVP, RPA Lead, Ciphix
Cristina Vidu, Senior Marketing Manager, UiPath Community EMEA
Dion Mes, Principal Sales Engineer, UiPath
13:15 ASML: RPA as Tactical Automation
Tactical robotic process automation for solving short-term challenges, while establishing standard and re-usable interfaces that fit IT's long-term goals and objectives.
Yannic Suurmeijer, System Architect, ASML
13:30 PostNL: an insight into RPA at PostNL
Showcasing the solutions our automations have provided, the challenges we’ve faced, and the best practices we’ve developed to support our logistics operations.
Leonard Renne, RPA Developer, PostNL
13:45 Break (30')
14:15 Breakout Sessions: Round 1
Modern Document Understanding in the cloud platform: AI-driven UiPath Document Understanding
Mike Bos, Senior Automation Developer, Tacstone Technology
Process Orchestration: scale up and have your Robots work in harmony
Jon Smith, UiPath MVP, RPA Lead, Ciphix
UiPath Integration Service: connect applications, leverage prebuilt connectors, and set up customer connectors
Johans Brink, CTO, MvR digital workforce
15:00 Breakout Sessions: Round 2
Automation, and GenAI: practical use cases for value generation
Thomas Janssen, UiPath MVP, Senior Automation Developer, Automation Heroes
Human in the Loop/Action Center
Dion Mes, Principal Sales Engineer @UiPath
Improving development with coded workflows
Idris Janszen, Technical Consultant, Ilionx
15:45 End remarks
16:00 Community fun games, sharing knowledge, drinks, and bites 🍻
13. Search Architecture
[Diagram: raw tweets from the RT stream pass through an Analyzer/Partitioner and are written as analyzed tweets to the RT index (Earlybird); in parallel, a Mapreduce Analyzer processes the raw-tweet archive on HDFS into the Archive index. The Blender receives search requests and searches both the RT index and the Archive index.]
14. Search Architecture
[Diagram: as in slide 13, but tweets now flow through a queue, and updates (deletes and engagement, e.g. retweets/favs) are also written to the RT index (Earlybird); the Blender searches the RT index and the Archive index, which is built by the Mapreduce Analyzer from HDFS.]
15. Search Architecture
[Diagram: the Blender now also queries the Social graph service and a User search index, in addition to the RT index (Earlybird) and the Archive index.]
• Blender is our Thrift service aggregator
• Queries multiple Earlybirds, merges results
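The Blender's merge step can be sketched as a scatter-gather over several Earlybird partitions. The class and method names below are illustrative stand-ins, not Twitter's actual API:

```java
import java.util.*;
import java.util.stream.*;

// Illustrative scatter-gather merge, loosely modeled on what a Thrift
// aggregator like Blender must do: query several index partitions and
// merge their per-partition results into one ranked list.
public class BlenderSketch {
    // A hit is (tweetId, score); each partition returns hits sorted by score desc.
    public record Hit(long tweetId, double score) {}

    // Merge k partition result lists, keeping the top n hits overall by score.
    public static List<Hit> merge(List<List<Hit>> partitionResults, int n) {
        return partitionResults.stream()
                .flatMap(List::stream)
                .sorted((a, b) -> Double.compare(b.score(), a.score()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Hit> p1 = List.of(new Hit(1, 0.9), new Hit(2, 0.4));
        List<Hit> p2 = List.of(new Hit(3, 0.7), new Hit(4, 0.1));
        // Top 3 across both partitions: ids 1, 3, 2
        System.out.println(merge(List.of(p1, p2), 3));
    }
}
```

In a real deployment the per-partition queries would be issued in parallel over Thrift and the merge would also deduplicate and re-score, but the core of the aggregator is this k-way merge.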
17. Search Architecture
[Diagram: the RT index (Earlybird), the Archive index, and User search each sit on top of Lucene.]
• For historic reasons, these used to be entirely different codebases, but had similar features/technologies
• Over time cross-dependencies were introduced to share code
18. Search Architecture
[Diagram: the RT index (Earlybird), the Archive index, and User search now share a common Lucene Extensions package layered on top of Lucene.]
• New Lucene extension package
• This package is truly generic and has no dependency on an actual product/index
• It contains Twitter’s extensions for real-time search, a thin segment management layer and other features
22. Lucene Extension Library
• Abstraction layer for Lucene index segments
• Real-time writer for in-memory index segments
• Schema-based Lucene document factory
• Real-time faceting
23. Lucene Extension Library
• API layer for Lucene segments
• *IndexSegmentWriter
• *IndexSegmentAtomicReader
• Two implementations
• In-memory: RealtimeIndexSegmentWriter (and reader)
• On-disk: LuceneIndexSegmentWriter (and reader)
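The writer/reader split above can be sketched as a minimal abstraction layer. The interfaces and the in-memory implementation here are hypothetical stand-ins for the *IndexSegmentWriter / *IndexSegmentAtomicReader classes named on the slide, not the real API:

```java
import java.util.*;

// Hypothetical sketch of a segment API with separate writer and
// point-in-time reader abstractions; all names are illustrative only.
interface IndexSegmentWriter {
    void addDocument(String doc);
    IndexSegmentAtomicReader openReader(); // point-in-time view of the segment
}

interface IndexSegmentAtomicReader {
    int maxDoc();                 // number of docs visible to this reader
    String document(int docId);
}

// In-memory implementation (the real-time case); an on-disk Lucene-backed
// implementation would expose the same two interfaces.
class RealtimeSegment implements IndexSegmentWriter {
    private final List<String> docs = new ArrayList<>();

    public void addDocument(String doc) { docs.add(doc); }

    public IndexSegmentAtomicReader openReader() {
        final int snapshot = docs.size(); // freeze the visible doc count
        return new IndexSegmentAtomicReader() {
            public int maxDoc() { return snapshot; }
            public String document(int docId) { return docs.get(docId); }
        };
    }
}
```

The key property is that a reader opened at some point in time keeps seeing exactly the documents that existed then, even as the writer keeps appending.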
24. Lucene Extension Library
• IndexSegments can be built ...
• in realtime
• on Mesos or Hadoop (Mapreduce)
• locally on serving machines
• Cluster-management code that deals with IndexSegments
• Share segments across serving machines using HDFS
• Can rebuild segments (e.g. to upgrade Lucene version, change data schema, etc.)
26. RealtimeIndexSegmentWriter
• Modified Lucene index implementation optimized for realtime search
• IndexWriter buffer is searchable (no need to flush to allow searching)
• In-memory
• Lock-free concurrency model for best performance
27. Concurrency - Definitions
• Pessimistic locking
• A thread holds an exclusive lock on a resource while an action is performed [mutual exclusion]
• Usually used when conflicts are expected to be likely
• Optimistic locking
• Operations are attempted atomically without holding a lock; conflicts can be detected, and retry logic is often used in case of conflicts
• Usually used when conflicts are expected to be the exception
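The two styles can be contrasted in a few lines of Java: the pessimistic version holds a lock for the whole read-modify-write, while the optimistic version attempts an atomic compare-and-set and retries on conflict. This is a generic illustration of the definitions, not Twitter code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LockingStyles {
    // Pessimistic: mutual exclusion around the whole read-modify-write.
    static int pessimisticCounter = 0;
    static synchronized void incrementPessimistic() {
        pessimisticCounter++;
    }

    // Optimistic: try the update atomically, detect conflicts, retry.
    static final AtomicInteger optimisticCounter = new AtomicInteger();
    static void incrementOptimistic() {
        int current;
        do {
            current = optimisticCounter.get();
            // compareAndSet fails if another thread changed the value meanwhile
        } while (!optimisticCounter.compareAndSet(current, current + 1));
    }
}
```

Under low contention the optimistic loop almost never retries, which is why it tends to win when conflicts are the exception.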
28. Concurrency - Definitions
• Non-blocking algorithm
Ensures that threads competing for shared resources do not have their execution indefinitely postponed by mutual exclusion.
• Lock-free algorithm
A non-blocking algorithm is lock-free if there is guaranteed system-wide progress.
• Wait-free algorithm
A non-blocking algorithm is wait-free if there is guaranteed per-thread progress.
* Source: Wikipedia
29. Concurrency
• Having a single writer thread simplifies our problem: no locks have to be used to protect data structures from corruption (only one thread modifies data)
• But: we have to make sure that all readers always see a consistent state of all data structures -> this is much harder than it sounds!
• In Java, it is not guaranteed that one thread will see changes that another thread makes in program execution order, unless the same memory barrier is crossed by both threads -> safe publication
• Safe publication can be achieved in different, subtle ways. Read the great book “Java Concurrency in Practice” by Brian Goetz for more information!
30. Java Memory Model
• Program order rule
Each action in a thread happens-before every action in that thread that comes later in the program order.
• Volatile variable rule
A write to a volatile field happens-before every subsequent read of that same field.
• Transitivity
If A happens-before B, and B happens-before C, then A happens-before C.
* Source: Brian Goetz: Java Concurrency in Practice
35. Concurrency
RAM 0
int x;
Thread A writes b=1 to RAM,
because b is volatile
5 x = 5;
1
Cache
Thread 1 Thread 2
time
volatile int b;
b = 1;
36. Concurrency
[Same diagram; x = 5 and b = 1 are now visible in RAM.]
Thread 2 then reads the volatile b and spins on x:
int dummy = b;
while(x != 5);
37. Concurrency
[Same diagram; a happens-before edge connects Thread 1's actions in program order.]
• Program order rule: Each action in a thread happens-before every action in
that thread that comes later in the program order.
38. Concurrency
[Same diagram; a happens-before edge runs from Thread 1's write b = 1 to Thread 2's read int dummy = b.]
• Volatile variable rule: A write to a volatile field happens-before every
subsequent read of that same field.
39. Concurrency
[Same diagram; by transitivity, a happens-before edge runs from Thread 1's x = 5 to Thread 2's while(x != 5).]
• Transitivity: If A happens-before B, and B happens-before C, then A
happens-before C.
40. Concurrency
[Same diagram; the loop condition in Thread 2 is highlighted.]
The while condition will be false, i.e. x == 5, so Thread 2 exits the loop immediately.
• Note: x itself doesn't have to be volatile. There can be many variables like x,
but we need only a single volatile field.
41. Concurrency
[Same diagram; the volatile write/read pair on b is labeled as the memory barrier.]
• Note: x itself doesn't have to be volatile. There can be many variables like x,
but we need only a single volatile field.
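The animation above can be condensed into a small runnable Java sketch (class and field names are mine, not from the slides): the plain field x is safely published to the reader thread through the single volatile field b.

```java
// Safe publication via a single volatile field, as in the diagram:
// the writer thread sets x = 5 and then b = 1; the reader spins on the
// volatile read of b, after which it is guaranteed to see x == 5.
public class SafePublication {
    static int x;             // plain (non-volatile) field
    static volatile int b;    // the single volatile "guard" field

    static int publishAndRead() {
        Thread writer = new Thread(() -> {
            x = 5;            // plain write
            b = 1;            // volatile write: crosses the memory barrier
        });
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (b != 1) { }  // volatile read: crosses the memory barrier
            // program order + volatile rule + transitivity => x == 5 here
            seen[0] = x;
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }
}
```

Without the volatile on b, the reader could spin forever or observe a stale x; with it, the three happens-before rules above guarantee the result.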
45. Concurrency
[Diagram: IndexWriter and IndexReader along a time axis.]
IndexWriter: writes 100 docs, sets maxDoc = 100, then writes more docs.
IndexReader: in IR.open(), reads maxDoc; searches up to maxDoc.
maxDoc is volatile.
46. Concurrency
[Same diagram; a happens-before edge runs from the IndexWriter's write of maxDoc to the IndexReader's read of it in IR.open().]
• Only maxDoc is volatile. All other fields that the IndexWriter writes to and
the IndexReader reads from don't need to be!
47. Wait-free
• Not a single exclusive lock
• Writer thread can always make progress
• Optimistic locking (retry logic) in a few places for the searcher thread
• Retry logic very simple and guaranteed to always make progress
48. In-memory Real-time Index
• Highly optimized for GC - all data is stored in blocked native arrays
• v1: Optimized for tweets with a term position limit of 255
• v2: Support for 32 bit positions without performance degradation
• v2: Basic support for out-of-order posting list inserts
50. In-memory Real-time Index
• RT term dictionary
• Term lookups using a lock-free hashtable in O(1)
• v2: Additional probabilistic, lock-free skip list maintains ordering on terms
• A perfect skip list is not an option: out-of-order inserts would require
rebalancing, which is impractical with our lock-free index
• In a probabilistic skip list, the tower height of a new (out-of-order) item can
be determined without knowing its insert position, by simply rolling dice
54. In-memory Real-time Index
• Probabilistic skip list: the tower height is determined by rolling dice
BEFORE knowing the insert location; the tower height never has to change
for an element, simplifying memory allocation and concurrency.
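The coin-flip step can be sketched in a few lines of Java (the method name and the height cap are mine): each extra level is added with probability 1/2, independent of where the element will end up in the list.

```java
import java.util.Random;

// Tower height for a probabilistic skip list: flip a coin until it
// comes up tails (or the cap is hit). No knowledge of the insert
// position is needed, so the height is fixed before insertion and
// never has to change afterwards.
public class SkipListTower {
    public static int towerHeight(Random rnd, int maxHeight) {
        int height = 1;
        while (height < maxHeight && rnd.nextBoolean()) {
            height++;   // heads: grow the tower one more level
        }
        return height;
    }
}
```

On average half the elements get height 1, a quarter height 2, and so on, which is what gives the skip list its expected O(log n) search cost without any rebalancing.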
55. Schema-based Document factory
• Apps provide one ThriftSchema per index and create a ThriftDocument for
each document
• SchemaDocumentFactory translates ThriftDocument -> Lucene Document
using the Schema
• Default field values
• Extended field settings
• Type-system on top of DocValues
• Validation
56. Schema-based Document factory
[Diagram: ThriftDocument + Schema -> SchemaDocumentFactory -> Lucene Document]
• Validation
• Fill in default values
• Apply correct Lucene field settings
57. Schema-based Document factory
[Same diagram.]
• Validation
• Fill in default values
• Apply correct Lucene field settings
• Decouples the core package from a specific product/index. Similar to
Solr/Elasticsearch.
61. Outlook
• Support for parallel (sliced) segments to support partial segment rebuilds
and other cool posting list update patterns
• Add remaining missing Lucene features to RT index
• Index term statistics for ranking
• Term vectors
• Stored fields
65. Searching for top entities within Tweets
• Task: Find the best photos in a subset of tweets
• We could use a Lucene index, where each photo is a document
• Problem: How to update existing documents when the same photos are
tweeted again?
• In-place posting list updates are hard
• Lucene’s updateDocument() is a delete/add operation - expensive and not
order-preserving
66. Searching for top entities within Tweets
• Task: Find the best photos in a subset of tweets
• Could we use our existing time-ordered tweet index?
• Facets!
67. Searching for top entities within Tweets
[Diagram of index components:]
• Inverted index: query -> matching doc ids; also maps term id -> term label
• Forward index: doc id -> document metadata
• Facet index: doc id -> term ids
69. Searching for top entities within Tweets
[Diagram: the query yields matching doc ids (5, 15, 9000, 9002, 100000, 100090); each is looked up in the facet index, and the resulting term ids increment counters in a top-k heap.]
Top-k heap (early state):
Id     Count
48239  8
31241  2
70. Searching for top entities within Tweets
[Same diagram, after more matching doc ids have been processed.]
Top-k heap:
Id     Count
48239  15
31241  12
85932  8
6748   3
71. Searching for top entities within Tweets
[Same diagram and heap as before.]
• Weighted counts (from engagement features) are used for relevance scoring
72. Searching for top entities within Tweets
[Same diagram and heap as before.]
• All query operators can be used, e.g. find the best photos in San Francisco
tweeted by people I follow
73. Searching for top entities within Tweets
[Diagram: the inverted index maps each term id back to its term label.]
74. Searching for top entities within Tweets
The term ids in the top-k heap are resolved to labels via the inverted index:
Id     Count  Label
48239  45     pic.twitter.com/jknui4w
31241  23     pic.twitter.com/dslkfj83
85932  15     pic.twitter.com/acm3ps
6748   11     pic.twitter.com/948jdsd
74294  8      pic.twitter.com/dsjkf15h
3728   5      pic.twitter.com/irnsoa32
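The counting and label-resolution steps walked through on the last few slides can be sketched as follows (all names, the map-based index layout, and the output format are illustrative, not Twitter's actual code):

```java
import java.util.*;
import java.util.stream.Collectors;

// Facet counting sketch: for every matching doc id, fetch its term ids
// from the facet index and bump a per-term counter; then sort the
// counters, keep the top k, and resolve term ids to labels via the
// term-id -> label mapping.
public class FacetTopK {
    public static List<String> topKLabels(int[] matchingDocs,
                                          Map<Integer, int[]> facetIndex,
                                          Map<Integer, String> termLabels,
                                          int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int doc : matchingDocs) {
            for (int termId : facetIndex.getOrDefault(doc, new int[0])) {
                counts.merge(termId, 1, Integer::sum);  // count hits per entity
            }
        }
        return counts.entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())  // highest count first
                .limit(k)
                .map(e -> termLabels.get(e.getKey()) + " (" + e.getValue() + ")")
                .collect(Collectors.toList());
    }
}
```

Because the counting happens at query time over facet data, the same photo appearing in many tweets simply accumulates a higher count; no document ever needs to be updated in place.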
75. Summary
• Indexing tweet entities (e.g. photos) as facets makes it possible to search
and rank top entities using a tweet index
• All query operators supported
• Documents don’t need to be reindexed
• Approach reusable for different use cases, e.g.: best vines, hashtags,
@mentions, etc.