Making Session Stores
More Intelligent
KYLE J. DAVIS
TECHNICAL MARKETING MANAGER
REDIS LABS
What is a session store?
• A chunk of data that is connected to one “user” of a service
–“user” can be a simple visitor
–or proper user with an account
• Often persisted between client and server by a token in a cookie*
–Cookie is given by server, stored by browser
–Client sends that cookie back to the server on subsequent requests
–Server associates that token with data
• Often the most frequently used data by that user
–Data that is specific to the user
–Data that is required for rendering or common use
• Often ephemeral and duplicated
A session store is…
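The token-in-a-cookie flow described above can be sketched in a few lines of Python. This is only an illustration: the `SessionStore` class, the cookie name `session` and the stored fields are assumptions made for the example, and a plain dictionary stands in for whatever backing store is actually used.

```python
import secrets


class SessionStore:
    """Maps opaque session tokens to per-user session data (in memory)."""

    def __init__(self):
        self._sessions = {}

    def create(self, data=None):
        # The server issues a random token; the browser stores it in a cookie.
        token = secrets.token_urlsafe(32)
        self._sessions[token] = dict(data or {})
        return token

    def get(self, token):
        # On later requests the server maps the token back to its data.
        return self._sessions.get(token)


store = SessionStore()
token = store.create({"username": "robin", "preferences": {"theme": "dark"}})

# Header the server would send once; the client returns the token afterwards.
set_cookie_header = f"Set-Cookie: session={token}; HttpOnly"

session = store.get(token)
```

An unknown or expired token simply yields `None`, which the server can treat as a fresh visitor.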
Session Storage Use Cases
Traditional
• Username
• Preferences
• Name
• “Stateful” data
Intelligent
• Traditional +
• Notifications
• Past behaviour
–content surfacing
–analytical information
–personalization
In a simple world
Internet Server Database
Good problems
Internet Server Database
Traffic Grows… Struggles
Good solution
Internet Server Database
performance restored
Session
storage on the
server
More good problems
Internet Server Database
Session
storage on the
server
Struggling
Problematic Solutions
Internet Server Database
Session
storage on the
server
Load balanced
Session
storage on the
server
Multiple Servers + On-server Sessions?
Robin Server Database
Server #1 – Hello Robin!
Multiple Servers + On-server Sessions?
Robin Server Database
Server #3 – Hello ????
Better solution
Internet Server Database
Load balanced
Redis
Session Storage
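The difference between on-server sessions and a shared session store can be simulated as follows. This is a sketch under stated assumptions: the `Server` class, the round-robin balancer and the token string are hypothetical, and a shared Python dictionary stands in for Redis.

```python
from itertools import cycle


class Server:
    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions  # token -> session data

    def login(self, token, username):
        self.sessions[token] = {"username": username}

    def greet(self, token):
        session = self.sessions.get(token)
        user = session["username"] if session else "????"
        return f"{self.name} - Hello {user}!"


def run(servers):
    """Log in on the first server, then send two more requests round-robin."""
    balancer = cycle(servers)
    next(balancer).login("token-robin", "Robin")
    return [next(balancer).greet("token-robin") for _ in range(2)]


# On-server sessions: every server has its own private dictionary.
local_greetings = run([Server(f"Server #{i}", {}) for i in (1, 2, 3)])

# Shared session store: every server reads and writes the same store.
shared_store = {}  # stand-in for Redis
shared_greetings = run([Server(f"Server #{i}", shared_store) for i in (1, 2, 3)])
```

With private dictionaries only the server that handled the login knows Robin; with the shared store every server behind the load balancer does.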
What is Redis?
Who We Are
Open source. The leading in-memory database platform,
supporting any high performance operational, analytics or
hybrid use case.
The open source home and commercial provider of Redis
Enterprise technology, platform, products & services.
14
Redis Top Differentiators
1. Performance: NoSQL Benchmark
2. Simplicity: Redis Data Structures
3. Extensibility: Redis Modules
Lists
Hashes
Bitmaps
Strings
Bit field
Streams
Hyperloglog
Sorted Sets
Sets
Geospatial Indexes
Performance: The Most Powerful Database
Highest Throughput at Lowest Latency
in High Volume of Writes Scenario
Least Servers Needed to
Deliver 1 Million Writes/Sec
Benchmarks performed by Avalon Consulting Group
Benchmarks published in the Google blog
Servers used to achieve 1M writes/sec
Simplicity: Data Structures - Redis’ Building Blocks
Lists
[ A → B → C → D → E ]
Hashes
{ A: “foo”, B: “bar”, C: “baz” }
Bitmaps
0011010101100111001010
Strings
"I'm a Plain Text String!"
Bit field
{23334}{112345569}{766538}
Key
“Retrieve the e-mail address of the user with the highest
bid in an auction that started on July 24th at 11:00pm PST” = ZREVRANGE 07242015_2300 0 0
Streams
{id1=time1.seq1(A:“xyz”, B:“cdf”),
id2=time2.seq2(D:“abc”, )}
Hyperloglog
00110101 11001110
Sorted Sets
{ A: 0.1, B: 0.3, C: 100 }
Sets
{ A , B , C , D , E }
Geospatial Indexes
{ A: (51.5, 0.12), B: (32.1, 34.7) }
• Add-ons that use a Redis API to seamlessly support additional
use cases and data structures.
• Enjoy Redis’ simplicity, super high performance, infinite
scalability and high availability.
Extensibility: Modules Extend Redis Infinitely
• Any C/C++/Go program can become a Module and run on Redis.
• Leverage existing data structures or introduce new ones.
• Can be used by anyone; Redis Enterprise Modules are tested and certified by Redis
Labs.
• Turn Redis into a Multi-Model database
Redis Labs Products
Offered as DBaaS or as software:
• Redisᵉ Cloud: fully managed, serverless Redisᵉ service on hosted resources within AWS, MS Azure, GCP, IBM Softlayer, Heroku, CF and OpenShift
• Redisᵉ Cloud Private: fully managed, serverless scaling Redisᵉ service in VPCs within AWS, MS Azure, GCP and IBM Softlayer
• Redisᵉ Pack: downloadable Redisᵉ software for any enterprise datacenter or cloud environment
• Redisᵉ Pack Managed: fully managed Redisᵉ Pack in private datacenters
Concepts
• Probabilistic data structure
• Hash -> sample bits -> set bits
• Properties:
–False negatives – not possible
–False positives – possible, but controllable
–Bits per item stored
–Add or check if exists
–Like the Tardis, it’s bigger on the inside than outside
• Availability:
–Redis Module
–On top of bitfields
Concept: Bloom Filters (presence)
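The "hash -> sample bits -> set bits" idea can be sketched in plain Python. This is an illustrative implementation, not the Redis module's: the `BloomFilter` class, its sizing from `capacity` and `error_rate`, and the SHA-256 double hashing are assumptions made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Probabilistic set: no false negatives, tunable false positives."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing formulas for the bit array and number of hashes.
        self.size = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Hash once, then derive several bit positions (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every sampled bit is set; a single 0 proves absence.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter(capacity=1000, error_rate=0.01)
for i in range(1000):
    seen.add(f"item:{i}")
```

With these parameters the filter uses about 9.6 bits per item, far less than storing the items themselves, and the filter never stores the original strings.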
• Probabilistic data structure
• Hash -> count runs -> store runs
• Properties:
–Estimates unique items
–Bits per item stored – 2^64 unique items in 12 KB /
standard error 0.81%
–Add, count or merge!
–Like the Tardis, it’s bigger on the inside than
outside
• Availability:
–All versions of Redis
Concept: HyperLogLog (cardinality)
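The "hash -> count runs -> store runs" idea, together with add, count and merge, can be sketched as follows. This is a simplified teaching version and not Redis's implementation: the `HyperLogLog` class and its use of SHA-1 are assumptions for the example. With `p=14` there are 16,384 registers, which at 6 bits each corresponds to the 12 KB figure above.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog cardinality estimator."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash
        index = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining bits
        # Rank = position of the first 1 bit, i.e. run of leading zeros + 1.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def merge(self, other):
        # The union of two sketches is the register-wise maximum.
        if self.p != other.p:
            raise ValueError("cannot merge sketches with different precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


views = HyperLogLog()
for _ in range(2):                      # duplicates do not change the estimate
    for i in range(10000):
        views.add(f"page:{i}")
estimate = views.count()
```

The estimate is approximate by design; the memory used stays fixed no matter how many items are added.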
engineering.conversantmedia.com
• It’s just bits!
• Fixed starting point, each point
represents a moment in time, flip to
represent activity
• Properties:
–Size relative to length of time (rounded up to whole bytes)
–Count totals or ranges
–BITOP (AND/XOR/OR/NOT)
• Availability:
–All versions of Redis
Concept: Bit counting (time series)
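Bit counting can be sketched with a Python integer used as a bit string. The `ActivityBits` class, the one-bit-per-day granularity and the example dates are assumptions for illustration; the operations loosely mirror the Redis SETBIT, BITCOUNT and BITOP commands but not their byte layout or byte-based ranges.

```python
from datetime import date


class ActivityBits:
    """One bit per day since a fixed starting point; 1 means 'was active'."""

    def __init__(self, start):
        self.start = start
        self.bits = 0

    def _slot(self, day):
        slot = (day - self.start).days
        if slot < 0:
            raise ValueError("day is before the fixed starting point")
        return slot

    def mark(self, day):
        self.bits |= 1 << self._slot(day)         # flip the bit for that day

    def total(self):
        return bin(self.bits).count("1")          # count all active days

    def count_range(self, first, last):
        lo, hi = self._slot(first), self._slot(last)
        if hi < lo:
            raise ValueError("last must not be before first")
        mask = ((1 << (hi - lo + 1)) - 1) << lo   # inclusive range of days
        return bin(self.bits & mask).count("1")


def active_on_same_days(a, b):
    """AND two bitmaps, then count the days on which both were active."""
    return bin(a.bits & b.bits).count("1")


start = date(2024, 1, 1)
robin, sam = ActivityBits(start), ActivityBits(start)
for day in (date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 10)):
    robin.mark(day)
for day in (date(2024, 1, 3), date(2024, 1, 4)):
    sam.mark(day)
```

A year of daily activity fits in 365 bits, or 46 bytes once rounded up.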
Group Notifications
Process
• Group of users get notification “Sale on sweaters”
• Insert into central table of notifications
• Insert a row for each user in the group, with the notification and a seen flag
• Each time it is needed, query the notifications table where the seen flag is false.
Traditional Group Notification Pattern
Traditional Group Notification Pattern
Challenges
• Adding/removing means touching a row for each user in group.
–Fine for groups of 10 users, what about 1 million?
–Also multi-step
• Storage is proportional to size of group and notifications
• Constant DB hits, not easily cacheable
• Setting “read” is a DB write
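The traditional pattern and its fan-out cost can be reproduced with an in-memory SQLite database. The table and column names below are hypothetical, chosen only to match the process described above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE notifications (id INTEGER PRIMARY KEY, message TEXT);
    CREATE TABLE user_notifications (
        user_id INTEGER, notification_id INTEGER, seen INTEGER DEFAULT 0
    );
""")


def notify_group(user_ids, message):
    """One central row plus one row per user in the group."""
    cur = db.execute("INSERT INTO notifications (message) VALUES (?)", (message,))
    rows = [(user_id, cur.lastrowid) for user_id in user_ids]
    db.executemany(
        "INSERT INTO user_notifications (user_id, notification_id) VALUES (?, ?)",
        rows,
    )
    return len(rows)


def unseen(user_id):
    """A database read every time the notifications are needed."""
    query = (
        "SELECT n.message FROM user_notifications un "
        "JOIN notifications n ON n.id = un.notification_id "
        "WHERE un.user_id = ? AND un.seen = 0"
    )
    return [row[0] for row in db.execute(query, (user_id,))]


def mark_seen(user_id, notification_id):
    """Marking as read is a database write."""
    db.execute(
        "UPDATE user_notifications SET seen = 1 "
        "WHERE user_id = ? AND notification_id = ?",
        (user_id, notification_id),
    )


rows_written = notify_group(range(1, 1001), "Sale on sweaters")
```

A single notification to a group of 1,000 users writes 1,000 per-user rows; the same call for a million users writes a million.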
Process
• Add notification to a single group-based structure or table (easily cacheable)
• The first n notifications are fetched for every user in the group.
• The notifications are checked to see if they are in a session-based Bloom filter or not.
• Mark read by adding to Bloom filter in session store.
Modern & Intelligent Group Notification Pattern
Modern & Intelligent Group Notification Pattern
Advantages
• Adding a notification only writes to a single table, single row.
• Model fits use – unread assumed.
• Fast and consistent: checking or setting read status takes the same time regardless of the number of items in the filter.
• ~5 bits per item, but the Bloom filter doesn’t always grow.
• Gentle scaling
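The pattern can be sketched as one shared list of group notifications plus a small Bloom filter per session. Everything named here is an assumption for illustration: `TinyBloom` is a minimal filter written for this example (not the Redis module), the notification IDs and messages are made up, and the list is assumed to hold the newest notifications first.

```python
import hashlib


class TinyBloom:
    """Minimal Bloom filter kept in a single Python integer."""

    def __init__(self, size_bits=4096, hashes=4):
        self.size_bits, self.hashes, self.bits = size_bits, hashes, 0

    def _positions(self, item):
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))


# One shared, cacheable structure for the whole group.
group_notifications = [
    (3, "New arrivals"),
    (2, "Free shipping weekend"),
    (1, "Sale on sweaters"),
]


def unread(session_filter, notifications, n=10):
    """Unread is assumed: anything not in the session's filter is shown."""
    return [msg for note_id, msg in notifications[:n]
            if f"notification:{note_id}" not in session_filter]


def mark_read(session_filter, note_id):
    """Marking as read only touches the session store, not the database."""
    session_filter.add(f"notification:{note_id}")


robin_session, sam_session = TinyBloom(), TinyBloom()
mark_read(robin_session, 1)
robin_unread = unread(robin_session, group_notifications)
sam_unread = unread(sam_session, group_notifications)
```

Because a Bloom filter can return false positives, a small fraction of notifications may be treated as read when they were not; sizing the filter controls how often that happens.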
Visual
Notification #1
Notification #2
Notification #3
Notification #4
Notification #5
Notification #6
✔
✗
✔
✗
✔
✗
Notification #1
Notification #2
Notification #3
Notification #4
Notification #5
Notification #6
✗
Fresh Content
Process
• Hand pick and rotate a small number of
content/items
• Stored in DB table
• Served out dumbly to users
Traditional Content Surfacing Pattern (Basic)
Challenges
• May serve content multiple times
• Freshness is linked to a manual curatorial
process
Traditional Content Surfacing Pattern (Advanced)
Process
• Batch process builds content list to surface
for each user
• List is stored in DB Table
• Served out to user
• Rotated on a schedule
Challenges
• Not Real-time
• May serve content multiple times
• Un-cacheable DB content
• Hard to scale
Process
• Middleware adds each content read to a
Bloom filter stored in the session
• Featured content list is built, can be
extensive.
• Featured items are checked vs Bloom filter
on-the-fly
Modern & Intelligent Content Surfacing Pattern
Advantages
• No DB hits for user
• Featured content is cacheable
• Will not show content multiple times if
read
• Tiny storage requirements even at scale
• Freshness can be achieved with zero/low
human input
• Real-time recording of activity –
immediate impact on fresh content
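The on-the-fly check of featured items against the session's record of reads can be sketched as below. The function names and content IDs are hypothetical. To keep the sketch short, a plain Python `set` stands in for the session's Bloom filter; any object that supports `add` and membership tests would work in its place.

```python
def record_read(seen, content_id):
    """Middleware hook: remember that this session has read content_id."""
    seen.add(content_id)


def fresh_content(seen, featured, limit=3):
    """Walk the cached featured list and keep items this session has not read."""
    fresh = []
    for content_id in featured:
        if content_id not in seen:
            fresh.append(content_id)
            if len(fresh) == limit:
                break
    return fresh


# Shared, cacheable featured list; it can be long because filtering is cheap.
featured = [f"Content #{i}" for i in range(1, 7)]

session_seen = set()  # stand-in for the Bloom filter stored in the session
for item in ("Content #2", "Content #4", "Content #6"):
    record_read(session_seen, item)

surfaced = fresh_content(session_seen, featured)
```

Each read is recorded immediately, so the next request for fresh content already reflects it.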
Visual
Content #1
Content #2
Content #3
Content #4
Content #5
Content #6
✔
✗
✔
✗
✔
✗
Content #1
Content #1
✗
Content #3
Content #5
Activity Pattern Monitoring & Personalization
• Monitor the usage behaviour
–Content viewed
–Activity over time
–Combinations of content history and activity
• Personalize the content based on the behaviour
• Seen as difficult to accomplish
–Analytics data
• Stored in another service
• Anonymized
–Complicated graph or ML based solutions
• Inferences
• Black boxes
Activity Pattern Monitoring & Personalization?
• Record site activity with bit counting
• Unique page views in HyperLogLog
• Leverage the page visit Bloom filter
• Simpler counter for pages consumed
• Create criteria based on session stored analytics
–New to a page? Bloom filter
–New to the site? Unique Page view = 1 (HLL) && Previously Visited = false (Bloom)
–Inactive user? Sum the bit count over the last five records, if = 0 then inactive
–Been to a cluster of pages (infer interest)? Check cluster of pages vs Bloom filter – combo!
Activity Pattern Monitoring & Personalization
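The criteria listed above can be expressed as small predicates over the analytics kept in the session. This is one possible reading of those criteria, not a definitive implementation: `SessionAnalytics` and its fields are hypothetical, and simple Python values stand in for the real structures (a `set` for the Bloom filter, an integer for the HyperLogLog estimate, a list for per-record bit counts).

```python
from dataclasses import dataclass, field


@dataclass
class SessionAnalytics:
    visited: set = field(default_factory=set)              # stand-in for the Bloom filter
    unique_page_views: int = 0                             # stand-in for the HLL estimate
    activity_counts: list = field(default_factory=list)   # bit counts, oldest first


def new_to_page(s, page):
    return page not in s.visited


def new_to_site(s, page):
    # One unique page view so far and no previous visit recorded.
    return s.unique_page_views == 1 and new_to_page(s, page)


def inactive(s):
    # Sum the bit counts of the last five records; zero means inactive.
    return sum(s.activity_counts[-5:]) == 0


def interested_in(s, cluster):
    # Has the user been to every page in the cluster?
    return all(page in s.visited for page in cluster)


newcomer = SessionAnalytics(unique_page_views=1, activity_counts=[1])
regular = SessionAnalytics(
    visited={"/sweaters", "/scarves", "/hats"},
    unique_page_views=3,
    activity_counts=[2, 0, 0, 0, 0, 0],
)
```

Here `regular` has visited the whole knitwear cluster but shows no activity in the last five records, so both the interest and the inactivity criteria apply.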
• Why is this suddenly possible?
–Probabilistic data structures are small/fast
–Bit counting is small/fast
–Decoupled from operational database
• What about privacy?
–Legitimate concern
–Non-reversible probabilistic structures
–Siloed from rest of database
Activity Pattern Monitoring & Personalization
Questions?
Thank you!

FIDO Munich Seminar Blueprint for In-Vehicle Payment Standard.pptxFIDO Munich Seminar Blueprint for In-Vehicle Payment Standard.pptx
FIDO Munich Seminar Blueprint for In-Vehicle Payment Standard.pptx
 
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Cracking AI Black Box - Strategies for Customer-centric Enterprise ExcellenceCracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
 
Keynote : AI & Future Of Offensive Security
Keynote : AI & Future Of Offensive SecurityKeynote : AI & Future Of Offensive Security
Keynote : AI & Future Of Offensive Security
 

Making Session Stores More Intelligent

  • 1. Making Session Stores More Intelligent KYLE J. DAVIS TECHNICAL MARKETING MANAGER REDIS LABS
  • 2. What is a session store?
  • 3. A session store is…
      • A chunk of data that is connected to one “user” of a service
          – The “user” can be a simple visitor
          – or a proper user with an account
      • Often persisted between client and server by a token in a cookie*
          – Cookie is given by the server, stored by the browser
          – Client sends that cookie back to the server on subsequent requests
          – Server associates that token with data
      • Often the most frequently used data by that user
          – Data that is specific to the user
          – Data that is required for rendering or common use
      • Often ephemeral and duplicated
  • 4. Session Storage Use Cases
      • Traditional
          – Username
          – Preferences
          – Name
          – “Stateful” data
      • Intelligent
          – Traditional +
          – Notifications
          – Past behaviour: content surfacing, analytical information, personalization
  • 5. In a simple world Internet Server Database
  • 6. Good problems Internet Server Database Traffic Grows… Struggles
  • 7. Good solution Internet Server Database performance restored Session storage on the server
  • 8. More good problems Internet Server Database Session storage on the server Struggling
  • 9. Problematic Solutions Internet Server Database Session storage on the server Load balanced Session storage on the server
  • 10. Multiple Servers + On-server Sessions? Server Database Robin Server #1 – Hello Robin!
  • 11. Multiple Servers + On-server Sessions? Server Database Robin Server #3 – Hello ????
  • 12. Better solution Internet Server Database Load balanced Redis Session Storage
  • 14. Who We Are: Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and commercial provider of Redis Enterprise technology, platform, products & services.
  • 15. Redis Top Differentiators
      • 1. Performance: NoSQL Benchmark
      • 2. Simplicity: Redis Data Structures (Lists, Hashes, Bitmaps, Strings, Bit field, Streams, Hyperloglog, Sorted Sets, Sets, Geospatial Indexes)
      • 3. Extensibility: Redis Modules
  • 16. Performance: The Most Powerful Database
      • Highest Throughput at Lowest Latency in High Volume of Writes Scenario
      • Fewest Servers Needed to Deliver 1 Million Writes/Sec
      • Benchmarks performed by Avalon Consulting Group
      • Benchmarks published in the Google blog
      • [Chart: servers used to achieve 1M writes/sec]
  • 17. Simplicity: Data Structures - Redis’ Building Blocks
      • Lists: [ A → B → C → D → E ]
      • Hashes: { A: “foo”, B: “bar”, C: “baz” }
      • Bitmaps: 0011010101100111001010
      • Strings: "I'm a Plain Text String!"
      • Bit field: {23334}{112345569}{766538}
      • Streams: {id1=time1.seq1(A:“xyz”, B:“cdf”), id2=time2.seq2(D:“abc”)}
      • Hyperloglog: 00110101 11001110
      • Sorted Sets: { A: 0.1, B: 0.3, C: 100 }
      • Sets: { A , B , C , D , E }
      • Geospatial Indexes: { A: (51.5, 0.12), B: (32.1, 34.7) }
      • Example: “Retrieve the e-mail address of the user with the highest bid in an auction that started on July 24th at 11:00pm PST” = ZREVRANGE 07242015_2300 0 0
  • 18. Extensibility: Modules Extend Redis Infinitely
      • Add-ons that use a Redis API to seamlessly support additional use cases and data structures.
      • Enjoy Redis’ simplicity, super high performance, infinite scalability and high availability.
      • Any C/C++/Go program can become a Module and run on Redis.
      • Leverage existing data structures or introduce new ones.
      • Can be used by anyone; Redis Enterprise Modules are tested and certified by Redis Labs.
      • Turn Redis into a Multi-Model database
  • 19. Redis Labs Products (DBaaS or Software)
      • Redisᵉ Cloud: fully managed, serverless Redisᵉ service on hosted resources within AWS, MS Azure, GCP, IBM Softlayer, Heroku, CF and OpenShift
      • Redisᵉ Cloud Private: fully managed, serverless scaling Redisᵉ service in VPCs within AWS, MS Azure, GCP and IBM Softlayer
      • Redisᵉ Pack: downloadable Redisᵉ software for any enterprise datacenter or cloud environment
      • Redisᵉ Pack Managed: fully managed Redisᵉ Pack in private datacenters
  • 21. Concept: Bloom Filters (presence)
      • Probabilistic data structure
      • Hash -> sample bits -> set bits
      • Properties:
          – False negatives: not possible
          – False positives: possible, but controllable
          – Bits per item stored
          – Add or check if exists
          – Like the Tardis, it’s bigger on the inside than outside
      • Availability:
          – Redis Module
          – On top of bitfields
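The "hash -> sample bits -> set bits" idea can be sketched in a few lines of plain Python. This is only an illustration, not the Redis module's implementation: the `BloomFilter` class, the default of 1024 bits and 3 hash positions, and the use of SHA-256 with double hashing are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: k bit positions per item over an m-bit array."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Hash once, then derive k positions by double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, item):
        # Set every sampled bit; bits are never cleared.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True if all sampled bits are set (may be a false positive).
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
for page in ("/home", "/sweaters", "/checkout"):
    seen.add(page)
```

Because `add` only ever sets bits, an item that was added always tests as present, which is why false negatives are not possible. The filter stays 128 bytes no matter how many items are added; what grows instead is the false-positive rate.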
  • 22. Concept: HyperLogLog (cardinality)
      • Probabilistic data structure
      • Hash -> count runs -> store runs
      • Properties:
          – Estimates unique items
          – Bits per item stored: 2^64 unique items in 12 KB, with an error rate of 0.81%
          – Add, count or merge!
          – Like the Tardis, it’s bigger on the inside than outside
      • Availability:
          – All versions of Redis
      • engineering.conversantmedia.com
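A rough sketch of "hash -> count runs -> store runs" in plain Python follows. It is a textbook-style HyperLogLog, not Redis's implementation; the `HyperLogLog` class name, the SHA-256 hash and the one-byte-per-register layout are assumptions for illustration. With 2^14 = 16384 registers the expected standard error is 1.04 / sqrt(16384), roughly 0.81%, matching the figure on the slide (Redis packs each register into 6 bits, hence 12 KB; this sketch uses 16 KB).

```python
import hashlib
import math


class HyperLogLog:
    """Textbook HyperLogLog with 2**precision one-byte registers."""

    def __init__(self, precision=14):
        self.p = precision
        self.m = 1 << precision
        self.registers = bytearray(self.m)

    def add(self, item):
        h = int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)               # first p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)    # remaining bits hold the run
        # Rank = number of leading zeros in the remaining bits, plus one.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def merge(self, other):
        # Union of two sketches: keep the larger rank per register.
        self.registers = bytearray(
            max(a, b) for a, b in zip(self.registers, other.registers)
        )


today = HyperLogLog()
for i in range(600):
    today.add(f"page:{i}")
    today.add(f"page:{i}")   # repeat views do not change the estimate
```

Merging assumes both sketches use the same precision. The merge step is what makes it possible to combine, say, per-day sketches into a per-week estimate without storing the underlying items.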
  • 23. Concept: Bit counting (time series)
      • It’s just bits!
      • Fixed starting point, each point represents a moment in time, flip to represent activity
      • Properties:
          – Size relative to length of time (rounded up to whole bytes)
          – Count totals or ranges
          – BITOP (AND/XOR/OR/NOT)
      • Availability:
          – All versions of Redis
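The bit-counting idea can be mimicked locally with a `bytearray`, which makes the "size relative to length of time" property easy to see. This is a sketch only: the `ActivityBits` class and `bit_and` helper are hypothetical names, and the slot numbering (one bit per time slot since a fixed start) is an assumption. Bit 0 is stored as the most significant bit of the first byte, the same ordering Redis uses for SETBIT.

```python
class ActivityBits:
    """One bit per time slot, counted from a fixed starting point."""

    def __init__(self):
        self.bits = bytearray()

    def set_active(self, slot):
        byte_index = slot // 8
        if byte_index >= len(self.bits):
            # Grow in whole bytes, only as far as the latest slot.
            self.bits.extend(bytes(byte_index + 1 - len(self.bits)))
        self.bits[byte_index] |= 0x80 >> (slot % 8)

    def is_active(self, slot):
        byte_index = slot // 8
        if byte_index >= len(self.bits):
            return False
        return bool(self.bits[byte_index] & (0x80 >> (slot % 8)))

    def count(self, start=0, end=None):
        """Count active slots in the half-open range [start, end)."""
        end = len(self.bits) * 8 if end is None else end
        return sum(self.is_active(slot) for slot in range(start, end))


def bit_and(a, b):
    """Slots where both series were active, in the spirit of BITOP AND."""
    out = ActivityBits()
    out.bits = bytearray(x & y for x, y in zip(a.bits, b.bits))
    return out


visits = ActivityBits()
for slot in (0, 3, 9):
    visits.set_active(slot)
purchases = ActivityBits()
for slot in (3, 10):
    purchases.set_active(slot)
```

Here `visits.count()` covers the whole series, `visits.count(0, 8)` covers only the first eight slots, and `bit_and(visits, purchases)` gives the slots in which both kinds of activity happened.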
  • 25. Traditional Group Notification Pattern
      • Process
          – A group of users gets the notification “Sale on sweaters”
          – Insert into a central table of notifications
          – Insert a row for each user in the group, with the notification and a seen flag
          – Each time it is needed, query the notifications table where the seen flag is false
  • 26. Traditional Group Notification Pattern
      • Challenges
          – Adding/removing means touching a row for each user in the group
              · Fine for groups of 10 users, but what about 1 million?
              · Also multi-step
          – Storage is proportional to the size of the group and the number of notifications
          – Constant DB hits, not easily cacheable
          – Setting “read” is a DB write
  • 27. Modern & Intelligent Group Notification Pattern
      • Process
          – Add the notification to a single group-based structure or table (easily cacheable)
          – The first n notifications are read by all users in the group
          – The notifications are checked to see if they are in a session-based Bloom filter or not
          – Mark read by adding to the Bloom filter in the session store
  • 28. Modern & Intelligent Group Notification Pattern
      • Advantages
          – Adding a notification only writes to a single table, single row
          – Model fits use: unread is assumed
          – Fast and consistent: the cost of checking for read or writing read is unrelated to the number of items in the filter
          – ~5 bits per item, but the Bloom filter doesn’t always grow
          – Gentle scaling
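The group notification process above might look roughly like this in application code. Everything here is a stand-in: the session is a plain dict rather than a Redis-backed store, the per-session Bloom filter is kept as a Python integer, and the names (`mark_read`, `is_read`, `unread`, `FILTER_BITS`) and the two extra sample notifications are hypothetical.

```python
import hashlib

FILTER_BITS = 512   # assumed size of the per-session filter
NUM_HASHES = 3      # assumed number of bit positions per notification


def _positions(notification_id):
    digest = hashlib.sha256(notification_id.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % FILTER_BITS for i in range(NUM_HASHES)]


def mark_read(session, notification_id):
    """Mark read by setting the notification's bits in the session filter."""
    for pos in _positions(notification_id):
        session["read_filter"] |= 1 << pos


def is_read(session, notification_id):
    return all((session["read_filter"] >> pos) & 1 for pos in _positions(notification_id))


def unread(session, group_notifications, n=10):
    """Read the first n group notifications, dropping those the filter reports as read."""
    return [note for note in group_notifications[:n] if not is_read(session, note["id"])]


# One shared, cacheable list per group: a new notification is a single write.
group_notifications = [
    {"id": "note:3", "text": "Free shipping this weekend"},
    {"id": "note:2", "text": "New arrivals"},
    {"id": "note:1", "text": "Sale on sweaters"},
]
session = {"user": "robin", "read_filter": 0}
mark_read(session, "note:1")
remaining = [note["id"] for note in unread(session, group_notifications)]
```

Nothing is written per user when a notification is created; the only per-user write is a few bits in the session when something is marked read. A false positive here would hide an unread notification, so the filter size is a trade-off between storage and how often that is acceptable.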
  • 29. Visual: Notifications #1 through #6, each marked ✔ or ✗.
  • 31. Traditional Content Surfacing Pattern (Basic)
      • Process
          – Hand pick and rotate a small number of content/items
          – Stored in a DB table
          – Served out dumbly to users
      • Challenges
          – May serve content multiple times
          – Freshness is linked to a manual curatorial process
  • 32. Traditional Content Surfacing Pattern (Advanced)
      • Process
          – Batch process builds a content list to surface for each user
          – List is stored in a DB table
          – Served out to the user
          – Rotated on a schedule
      • Challenges
          – Not real-time
          – May serve content multiple times
          – Un-cacheable DB content
          – Hard to scale
  • 33. Modern & Intelligent Content Surfacing Pattern
      • Process
          – Middleware adds each content read to a Bloom filter stored in the session
          – Featured content list is built; it can be extensive
          – Featured items are checked vs the Bloom filter on-the-fly
      • Advantages
          – No DB hits for the user
          – Featured content is cacheable
          – Will not show content multiple times if read
          – Tiny storage requirements even at scale
          – Freshness can be achieved with zero/low human input
          – Real-time recording of activity, with immediate impact on fresh content
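As a sketch of the content surfacing process above, the middleware step and the on-the-fly check could be written as two small functions. The session is again a plain dict standing in for the session store, and `record_view`, `probably_seen`, `surface`, the 2048-bit filter size and the `content:N` identifiers are all assumptions for the example.

```python
import hashlib
import itertools

SIZE_BITS = 2048    # assumed per-session filter size
NUM_HASHES = 4      # assumed number of bit positions per item


def _bit_positions(content_id):
    digest = hashlib.sha256(content_id.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % SIZE_BITS for i in range(NUM_HASHES)]


def record_view(session, content_id):
    """Middleware step: remember in the session that content_id was read."""
    seen = session.setdefault("seen_filter", bytearray(SIZE_BITS // 8))
    for pos in _bit_positions(content_id):
        seen[pos // 8] |= 1 << (pos % 8)


def probably_seen(session, content_id):
    seen = session.get("seen_filter")
    if seen is None:
        return False
    return all(seen[pos // 8] & (1 << (pos % 8)) for pos in _bit_positions(content_id))


def surface(session, featured, limit=3):
    """Return up to `limit` featured items this session has not read yet."""
    fresh = (item for item in featured if not probably_seen(session, item))
    return list(itertools.islice(fresh, limit))


# The featured list is shared by all users, so it can be cached.
featured = [f"content:{i}" for i in range(1, 7)]
session = {}
for item in ("content:1", "content:3", "content:5"):
    record_view(session, item)
fresh_items = surface(session, featured)
```

The shared featured list can be long because filtering happens per request against a 256-byte structure in the session. A false positive only means an unread item is occasionally skipped, which is usually harmless for surfacing.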
  • 34. Visual: Content #1 through #6, marked ✔ ✗ ✔ ✗ ✔ ✗ in turn.
  • 35. Activity Pattern Monitoring & Personalization
  • 36. Activity Pattern Monitoring & Personalization?
      • Monitor the usage behaviour
          – Content viewed
          – Activity over time
          – Combinations of content history and activity
      • Personalize the content based on the behaviour
      • Seen as difficult to accomplish
          – Analytics data: stored in another service, anonymized
          – Complicated graph or ML based solutions: inferences, black boxes
  • 37. Activity Pattern Monitoring & Personalization
      • Record site activity with bit counting
      • Unique page views in HyperLogLog
      • Leverage the page visit Bloom filter
      • Simpler counter for pages consumed
      • Create criteria based on session stored analytics
          – New to a page? Bloom filter
          – New to the site? Unique Page view = 1 (HLL) && Previously Visited = false (Bloom)
          – Inactive user? Sum the bit count over the last five records; if = 0 then inactive
          – Been to a cluster of pages (infer interest)? Check cluster of pages vs Bloom filter – combo!
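The criteria listed above are simple boolean combinations, which a short sketch can make concrete. To keep it self-contained, the probabilistic structures are replaced by plain stand-ins: a `set` for the Bloom filter, an `int` for the HyperLogLog estimate and a list of per-record bit counts. The `SessionAnalytics` class and function names are hypothetical, and treating "been to a cluster" as "visited every page in the cluster" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SessionAnalytics:
    visited: set = field(default_factory=set)             # stand-in for the Bloom filter
    unique_page_views: int = 0                            # stand-in for the HLL estimate
    activity_counts: list = field(default_factory=list)   # bit count per record, newest last


def new_to_page(s, page):
    return page not in s.visited


def new_to_site(s, page):
    # Unique page views = 1 (HLL) and previously visited = false (Bloom).
    return s.unique_page_views == 1 and page not in s.visited


def inactive(s, records=5):
    # Sum the bit counts over the last five records; zero means inactive.
    return sum(s.activity_counts[-records:]) == 0


def interested_in(s, cluster):
    # Assumed rule: every page in the cluster has been visited.
    return all(page in s.visited for page in cluster)


robin = SessionAnalytics(
    visited={"/sweaters", "/scarves"},
    unique_page_views=7,
    activity_counts=[3, 0, 0, 0, 0, 0],
)
```

With these values, `inactive(robin)` and `interested_in(robin, ["/sweaters", "/scarves"])` hold, while `new_to_site(robin, "/hats")` does not. With the real structures each check is a handful of bit reads in the session store, with no query against the operational database.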
  • 38. Activity Pattern Monitoring & Personalization
      • Why is this suddenly possible?
          – Probabilistic data structures are small/fast
          – Bit counting is small/fast
          – Decoupled from the operational database
      • What about privacy?
          – Legitimate concern
          – Non-reversible probabilistic structures
          – Siloed from the rest of the database