Making Session Stores
More Intelligent
What is a session store?
• A chunk of data that is connected to one “user” of a service
–“user” can be a simple visitor
–or a proper user with an account
• Often persisted between client and server by a token in a cookie*
–Cookie is given by server, stored by browser
–Client sends that cookie back to the server on subsequent requests
–Server associates that token with data
• Often the most frequently used data by that user
–Data that is specific to the user
–Data that is required for rendering or common use
• Often ephemeral and duplicated
A session store is…
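The token-to-data association described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `SessionStore` class and its method names are hypothetical, and a real deployment would keep the data in a shared store rather than a Python dict.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a session store keyed by an opaque token."""

    def __init__(self):
        self._sessions = {}

    def create(self, data):
        # The server issues a random token; the browser keeps it in a cookie.
        token = secrets.token_urlsafe(32)
        self._sessions[token] = dict(data)
        return token

    def load(self, token):
        # Unknown tokens simply yield no session.
        return self._sessions.get(token)


store = SessionStore()
cookie_value = store.create({"username": "robin", "theme": "dark"})
session = store.load(cookie_value)  # what the server does on each later request
```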
Session Storage Use Cases
Traditional
• Username
• Preferences
• Name
• “Stateful” data
Intelligent
• Traditional +
• Notifications
• Past behaviour
–content surfacing
–analytical information
–personalization
In a simple world
Internet Server Database
Good problems
Internet Server Database
Traffic Grows… Struggles
Good solution
Internet Server Database
Performance restored
Session storage on the server
More good problems
Internet Server Database
Session storage on the server
Struggling
Problematic Solutions
Internet Server Database
Session storage on the server
Load balanced
Session storage on the server
Multiple Servers + On-server Sessions?
Server Database Robin
Server #1 – Hello Robin!
Multiple Servers + On-server Sessions?
Server Database Robin
Server #3 – Hello ????
Better solution
Internet Server Database
Load balanced
Session Storage
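The difference between on-server sessions and shared session storage can be shown with a toy model. Everything here is an assumption made for illustration: the `Server` class, the `round_robin` balancer, and the plain dict standing in for the shared store.

```python
import itertools


class Server:
    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions  # mapping of token -> session data

    def login(self, token, username):
        self.sessions[token] = {"username": username}

    def greet(self, token):
        session = self.sessions.get(token)
        who = session["username"] if session else "????"
        return f"{self.name} - Hello {who}!"


def round_robin(servers):
    """Hypothetical load balancer: hands out servers in rotation."""
    return itertools.cycle(servers)


# On-server sessions: each server has its own private dict.
local = [Server(f"Server #{i}", {}) for i in (1, 2, 3)]
local[0].login("tok", "Robin")
local_greetings = [s.greet("tok") for s in local]

# Shared session storage: every server reads and writes the same mapping.
shared_store = {}
shared = [Server(f"Server #{i}", shared_store) for i in (1, 2, 3)]
balancer = round_robin(shared)
next(balancer).login("tok", "Robin")
shared_greetings = [next(balancer).greet("tok") for _ in range(3)]
```

With private dicts only the server that handled the login recognises Robin; with the shared mapping any server the balancer picks does.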
What is Redis?
Who We Are
Open source. The leading in-memory database platform,
supporting any high performance operational, analytics or
hybrid use case.
The open source home and commercial provider of Redis
Enterprise technology, platform, products & services.
Redis Top Differentiators
Simplicity Extensibility Performance
NoSQL Benchmark
Redis Data Structures
Redis Modules
Bit field
Sorted Sets
Geospatial Indexes
Performance: The Most Powerful Database
Highest Throughput at Lowest Latency
in High Volume of Writes Scenario
Least Servers Needed to
Deliver 1 Million Writes/Sec
Benchmarks performed by Avalon Consulting Group
Benchmarks published in the Google blog
Simplicity: Data Structures - Redis’ Building Blocks
[ A → B → C → D → E ]
{ A: “foo”, B: “bar”, C: “baz” }
“I'm a Plain Text String!”
Bit field
“Retrieve the e-mail address of the user with the highest bid in an auction that started on July 24th at 11:00pm PST”
ZREVRANGE 07242015_2300 0 0
{id1=time1.seq1(A:“xyz”, B:“cdf”),
id2=time2.seq2(D:“abc”, )}
00110101 11001110
Sorted Sets
{ A: 0.1, B: 0.3, C: 100 }
{ A , B , C , D , E }
Geospatial Indexes
{ A: (51.5, 0.12), B: (32.1, 34.7) }
• Add-ons that use a Redis API to seamlessly support additional
use cases and data structures.
• Enjoy Redis’ simplicity, super high performance, infinite
scalability and high availability.
Extensibility: Modules Extend Redis Infinitely
• Any C/C++/Go program can become a Module and run on Redis.
• Leverage existing data structures or introduce new ones.
• Can be used by anyone; Redis Enterprise Modules are tested and certified by Redis
• Turn Redis into a Multi-Model database
Redis Labs Products
• Redise Cloud: Fully managed, serverless Redise service on hosted resources within AWS, MS Azure, GCP, IBM Softlayer, Heroku, CF and …
• Redise Cloud Private: Fully managed, serverless scaling Redise service in VPCs within AWS, MS Azure, GCP and IBM Softlayer
• Redise Pack Managed: Fully managed Redise Pack in private …
• Redise Pack: Downloadable Redise software for any enterprise datacenter or cloud environment
DBaaS Software
• Probabilistic data structure
• Hash -> sample bits -> set bits
• Properties:
–False negatives – not possible
–False positives – possible, but controllable
–Bits per item stored
–Add or check if exists
–Like the Tardis, it’s bigger on the inside than outside
• Availability:
–Redis Module
–On top of bitfields
Concept: Bloom Filters (presence)
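To make "hash -> sample bits -> set bits" concrete, here is a small pure-Python Bloom filter. It is a sketch under assumptions, not the Redis module's implementation: the default size, the number of hash positions, and the SHA-256 double-hashing scheme are all choices made for this example.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "probably added"; False means "definitely not added".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
for page in ("/home", "/sweaters", "/checkout"):
    seen.add(page)
```

Every added item is always reported as present, which is why false negatives cannot occur; an item that was never added can occasionally collide on all of its bit positions, which is the controllable false-positive rate.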
• Probabilistic data structure
• Hash -> count runs -> store runs
• Properties:
–Estimates unique items
–Bits per item stored – 2^64 unique items in 12 KB / standard error 0.81%
–Add, count or merge!
–Like the Tardis, it’s bigger on the inside than outside
• Availability:
–All versions of Redis
Concept: HyperLogLog (cardinality)
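The "hash -> count runs -> store runs" idea can be sketched as follows. This is a simplified illustration rather than Redis's implementation: it uses SHA-256, one byte per register (Redis packs its 16,384 registers into 6 bits each, which is where 12 KB comes from), and only the small-range correction.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, precision=14):
        self.p = precision
        self.m = 1 << precision            # number of registers
        self.registers = bytearray(self.m)

    def add(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")       # 64-bit hash
        index = h >> (64 - self.p)                  # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        estimate = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = m * math.log(m / zeros)
        return round(estimate)

    def merge(self, other):
        # Union of two sketches with the same precision: register-wise max.
        self.registers = bytearray(
            max(a, b) for a, b in zip(self.registers, other.registers))


monday, tuesday = HyperLogLog(), HyperLogLog()
for i in range(0, 6000):
    monday.add(f"user-{i}")
for i in range(4000, 10000):
    tuesday.add(f"user-{i}")
monday.merge(tuesday)          # now estimates the 10,000 distinct users
week_estimate = monday.count()
```

The quoted 0.81% follows from the register count: 1.04 / sqrt(16384) ≈ 0.0081.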
• It’s just bits!
• Fixed starting point, each point
represents a moment in time, flip to
represent activity
• Properties:
–Size relative to length of time (byte round)
–Count totals or ranges
• Availability:
–All versions of Redis
Concept: Bit counting (time series)
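A rough sketch of bit counting for a time series, assuming one bit per time slot (for example, one per day since sign-up). The `ActivityBits` class is hypothetical; in Redis the same idea maps onto the SETBIT, GETBIT and BITCOUNT commands.

```python
class ActivityBits:
    """One bit per time slot, counted from a fixed starting point."""

    def __init__(self):
        self.bits = bytearray()

    def mark(self, slot):
        byte, offset = divmod(slot, 8)
        if byte >= len(self.bits):
            # Grow in whole bytes, so size tracks the length of time covered.
            self.bits.extend(bytes(byte + 1 - len(self.bits)))
        self.bits[byte] |= 0x80 >> offset

    def is_active(self, slot):
        byte, offset = divmod(slot, 8)
        return byte < len(self.bits) and bool(self.bits[byte] & (0x80 >> offset))

    def count(self, start=0, end=None):
        """Number of active slots in [start, end)."""
        end = len(self.bits) * 8 if end is None else end
        return sum(self.is_active(slot) for slot in range(start, end))


activity = ActivityBits()
for day in (0, 3, 4, 20):
    activity.mark(day)
```

Twenty-one days of history fit in three bytes here, and both totals and ranges come from counting set bits.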
Group Notifications
• Group of users get notification “Sale on sweaters”
• Insert into central table of notifications
• Insert row in table with each user of group with notification and seen flag
• Each time it is needed, query notifications table where seen flag is false.
Traditional Group Notification Pattern
Traditional Group Notification Pattern
• Adding/removing means touching a row for each user in group.
–Fine for groups of 10 users, what about 1 million?
–Also multi-step
• Storage is proportional to size of group and notifications
• Constant DB hits, not easily cacheable
• Setting “read” is DB write
• Add notification to single group based structure or table (easily cacheable)
• First n notifications are read by all users in group.
• The notifications are checked to see if they are in a session-based Bloom filter or not.
• Mark read by adding to Bloom filter in session store.
Modern & Intelligent Group Notification Pattern
Modern & Intelligent Group Notification Pattern
• Adding a notification only writes to a single table, single row.
• Model fits use – unread assumed.
• Fast. Checking for read / writing read is unrelated to number of items in the filter.
• ~5-bits per item, but Bloom filter doesn’t always grow.
• Gentle scaling
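The pattern can be sketched with a shared notification list per group and a tiny Bloom filter kept in each session. This is an illustration under assumptions: the filter is stored as a Python integer, its size and hash count are arbitrary, and the group name and notification texts are made up.

```python
import hashlib

FILTER_BITS = 4096   # assumed filter size
NUM_HASHES = 3       # assumed number of bit positions per item


def positions(item):
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % FILTER_BITS
            for i in range(NUM_HASHES)]


# One shared, cacheable list per group (hypothetical data).
group_notifications = {"knitwear": ["Sale on sweaters", "New scarves in stock"]}


def new_session(user):
    # Unread is the default: an empty filter has no bits set.
    return {"user": user, "read_filter": 0}


def is_read(session, notification):
    return all((session["read_filter"] >> p) & 1 for p in positions(notification))


def mark_read(session, notification):
    # Marking as read touches only this user's session, never the group list.
    for p in positions(notification):
        session["read_filter"] |= 1 << p


def unread(session, group):
    return [n for n in group_notifications[group] if not is_read(session, n)]


robin = new_session("robin")
before = unread(robin, "knitwear")
mark_read(robin, "Sale on sweaters")
after = unread(robin, "knitwear")
```

One trade-off to keep in mind: a Bloom filter false positive would show up here as a notification wrongly treated as read, so the filter should be sized for an error rate the product can tolerate.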
Notification #1
Notification #2
Notification #3
Notification #4
Notification #5
Notification #6
Fresh Content
• Hand pick and rotate a small number of
• Stored in DB table
• Served out dumbly to users
Traditional Content Surfacing Pattern (Basic)
• May serve content multiple times
• Freshness is linked to a manual curatorial
Traditional Content Surfacing Pattern (Advanced)
• Batch process builds content list to surface
for each user
• List is stored in DB Table
• Served out to user
• Rotated on a schedule
• Not Real-time
• May serve content multiple times
• Un-cacheable DB content
• Hard to scale
• Middleware adds each content read to a
Bloom filter stored in the session
• Featured content list is built, can be
• Featured items are checked vs Bloom filter
Modern & Intelligent Content Surfacing Pattern
• No DB hits for user
• Featured content is cacheable
• Will not show content multiple times if
• Tiny storage requirements even at scale
• Freshness can be achieved with zero/low
human input
• Real-time recording of activity –
immediate impact on fresh content
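A possible shape for the middleware step, sketched as a decorator around a request handler. The handler signature, the `seen_filter` session key and the filter parameters are assumptions for this example; the featured list stands in for whatever cacheable list the site builds.

```python
import hashlib

FILTER_BITS = 512   # assumed filter size
NUM_HASHES = 3      # assumed number of bit positions per item


def _positions(content_id):
    digest = hashlib.sha256(content_id.encode("utf-8")).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % FILTER_BITS
            for i in range(NUM_HASHES)]


def seen(session, content_id):
    bits = session.get("seen_filter", 0)
    return all((bits >> pos) & 1 for pos in _positions(content_id))


def track_reads(handler):
    """Middleware: record every content read in the session's Bloom filter."""
    def wrapped(session, content_id):
        bits = session.get("seen_filter", 0)
        for pos in _positions(content_id):
            bits |= 1 << pos
        session["seen_filter"] = bits
        return handler(session, content_id)
    return wrapped


@track_reads
def show_content(session, content_id):
    return f"rendering {content_id}"


def fresh(session, featured):
    # The featured list is shared and cacheable; filtering is per session.
    return [c for c in featured if not seen(session, c)]


featured = [f"content-{n}" for n in range(1, 7)]
session = {}
all_fresh = fresh(session, featured)
page = show_content(session, "content-2")
still_fresh = fresh(session, featured)
```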
Content #1
Content #2
Content #3
Content #4
Content #5
Content #6
Content #1
Content #1
Content #3
Content #5
Activity Pattern Monitoring & Personalization
• Monitor the usage behaviour
–Content viewed
–Activity over time
–Combinations of content history and activity
• Personalize the content based on the behaviour
• Seen as difficult to accomplish
–Analytics data
• Stored in another service
• Anonymized
–Complicated graph or ML based solutions
• Inferences
• Black boxes
Activity Pattern Monitoring & Personalization?
• Record site activity with bit counting
• Unique page views in HyperLogLog
• Leverage the page visit Bloom filter
• Simpler counter for pages consumed
• Create criteria based on session stored analytics
–New to a page? Bloom filter
–New to the site? Unique Page view = 1 (HLL) && Previously Visited = false (Bloom)
–Inactive user? Sum the bit count over the last five records, if = 0 then inactive
–Been to a cluster of pages (infer interest)? Check cluster of pages vs Bloom filter – combo!
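The criteria above reduce to a few small predicates over session-stored analytics. To keep the sketch short, exact Python types stand in for the probabilistic structures: a `set` replaces the Bloom filter, an integer replaces the HyperLogLog count, and a list of per-record bit counts replaces the bitmap. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SessionAnalytics:
    visited: set = field(default_factory=set)             # stand-in for the Bloom filter
    unique_pages: int = 0                                  # stand-in for the HLL count
    activity_counts: list = field(default_factory=list)   # bit count per record


def new_to_page(s, page):
    return page not in s.visited


def new_to_site(s, page):
    # Unique page views = 1 (HLL) and previously visited = false (Bloom).
    return s.unique_pages == 1 and new_to_page(s, page)


def inactive(s, records=5):
    # Sum the bit counts over the last five records; zero means inactive.
    return sum(s.activity_counts[-records:]) == 0


def interested_in(s, cluster):
    # Every page of the cluster has been visited: infer interest.
    return all(page in s.visited for page in cluster)


robin = SessionAnalytics(
    visited={"/sweaters", "/scarves"},
    unique_pages=2,
    activity_counts=[3, 0, 0, 0, 0, 0],
)
```

Swapping the stand-ins for real probabilistic structures keeps the predicates unchanged; only the answers become approximate.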
Activity Pattern Monitoring & Personalization
• Why is this suddenly possible?
–Probabilistic data structures are small/fast
–Bit counting is small/fast
–Decoupled from operational database
• What about privacy?
–Legitimate concern
–Non-reversible probabilistic structures
–Siloed from rest of database
Activity Pattern Monitoring & Personalization
Thank you!

Making Session Stores More Intelligent

  • 1. Making Session Stores More Intelligent. Kyle J. Davis, Technical Marketing Manager, Redis Labs
  • 2. What is a session store?
  • 3. A session store is…
      • A chunk of data that is connected to one “user” of a service
          – “user” can be a simple visitor
          – or a proper user with an account
      • Often persisted between client and server by a token in a cookie*
          – Cookie is given by server, stored by browser
          – Client sends that cookie back to the server on subsequent requests
          – Server associates that token with data
      • Often the most frequently used data by that user
          – Data that is specific to the user
          – Data that is required for rendering or common use
      • Often ephemeral and duplicated
  • 4. Session Storage Use Cases
      Traditional
      • Username
      • Preferences
      • Name
      • “Stateful” data
      Intelligent
      • Traditional +
      • Notifications
      • Past behaviour
          – content surfacing
          – analytical information
          – personalization
  • 5. In a simple world. [Diagram: Internet, Server, Database]
  • 6. Good problems. [Diagram: Internet, Server, Database. Labels: Traffic Grows…, Struggles]
  • 7. Good solution. [Diagram: Internet, Server, Database. Labels: performance restored, Session storage on the server]
  • 8. More good problems. [Diagram: Internet, Server, Database. Labels: Session storage on the server, Struggling]
  • 9. Problematic Solutions. [Diagram: Internet, Server, Database. Labels: Load balanced, Session storage on the server (×2)]
  • 10. Multiple Servers + On-server Sessions? [Diagram: Robin, Server, Database] Server #1 – Hello Robin!
  • 11. Multiple Servers + On-server Sessions? [Diagram: Robin, Server, Database] Server #3 – Hello ????
  • 12. Better solution. [Diagram: Internet, Server, Database. Labels: Load balanced, Redis Session Storage]
  • 14. Who We Are
      • Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case.
      • The open source home and commercial provider of Redis Enterprise technology, platform, products & services.
  • 15. Redis Top Differentiators
      1. Performance: NoSQL Benchmark
      2. Simplicity: Redis Data Structures
      3. Extensibility: Redis Modules
      Data structures: Lists, Hashes, Bitmaps, Strings, Bit field, Streams, Hyperloglog, Sorted Sets, Sets, Geospatial Indexes
  • 16. Performance: The Most Powerful Database
      • Highest Throughput at Lowest Latency in High Volume of Writes Scenario (benchmarks performed by Avalon Consulting Group)
      • Least Servers Needed to Deliver 1 Million Writes/Sec (benchmarks published in the Google blog)
      [Chart axis: Servers used to achieve 1M writes/sec]
  • 17. Simplicity: Data Structures - Redis’ Building Blocks
      • Lists: [ A → B → C → D → E ]
      • Hashes: { A: “foo”, B: “bar”, C: “baz” }
      • Bitmaps: 0011010101100111001010
      • Strings: "I'm a Plain Text String!"
      • Bit field: {23334}{112345569}{766538}
      • Streams: {id1=time1.seq1(A:“xyz”, B:“cdf”), id2=time2.seq2(D:“abc”, )}
      • Hyperloglog: 00110101 11001110
      • Sorted Sets: { A: 0.1, B: 0.3, C: 100 }
      • Sets: { A , B , C , D , E }
      • Geospatial Indexes: { A: (51.5, 0.12), B: (32.1, 34.7) }
      Example: ”Retrieve the e-mail address of the user with the highest bid in an auction that started on July 24th at 11:00pm PST”
      ZREVRANGE 07242015_2300 0 0
  • 18. Extensibility: Modules Extend Redis Infinitely
      • Add-ons that use a Redis API to seamlessly support additional use cases and data structures.
      • Enjoy Redis’ simplicity, super high performance, infinite scalability and high availability.
      • Any C/C++/Go program can become a Module and run on Redis.
      • Leverage existing data structures or introduce new ones.
      • Can be used by anyone; Redis Enterprise Modules are tested and certified by Redis Labs.
      • Turn Redis into a Multi-Model database
  • 19. Redis Labs Products
      • Redisᵉ Pack Managed: Fully managed Redisᵉ Pack in private datacenters
      • Redisᵉ Pack: Downloadable Redisᵉ software for any enterprise datacenter or cloud environment
      • Redisᵉ Cloud Private: Fully managed, serverless scaling Redisᵉ service in VPCs within AWS, MS Azure, GCP and IBM Softlayer
      • Redisᵉ Cloud: Fully managed, serverless Redisᵉ service on hosted resources within AWS, MS Azure, GCP, IBM Softlayer, Heroku, CF and OpenShift
      DBaaS / Software
  • 21. Concept: Bloom Filters (presence)
      • Probabilistic data structure
      • Hash -> sample bits -> set bits
      • Properties:
          – False negatives – not possible
          – False positives – possible, but controllable
          – Bits per item stored
          – Add or check if exists
          – Like the Tardis, it’s bigger on the inside than outside
      • Availability:
          – Redis Module
          – On top of bitfields
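The "hash -> sample bits -> set bits" flow on this slide can be sketched in plain Python. This is an illustrative in-memory version, not the Redis module: the class name `BloomFilter`, the SHA-256 double-hashing scheme, and the `capacity` / `error_rate` parameters are assumptions made for the example. The sizing uses the standard Bloom filter formulas.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter; names and hashing scheme are assumptions."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.size = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Hash once, then derive k bit positions from two 64-bit halves.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item):
        # Set every sampled bit.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # Present only if every sampled bit is set; one clear bit proves absence.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


seen = BloomFilter(capacity=1000)
seen.add("notification:1")
seen.add("notification:3")
print("notification:1" in seen, "notification:2" in seen)
```

An item that was added always tests as present, which is why false negatives are not possible. An item that was never added can occasionally test as present when other items happen to have set all of its bits; a lower `error_rate` buys fewer of those at the cost of more bits per item.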
  • 22. Concept: HyperLogLog (cardinality)
      • Probabilistic data structure
      • Hash -> count runs -> store runs
      • Properties:
          – Estimates unique items
          – Bits per item stored – 2^64 unique items in 12 KB / error rate 0.81%
          – Add, count or merge!
          – Like the Tardis, it’s bigger on the inside than outside
      • Availability:
          – All versions of Redis
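To make "hash -> count runs -> store runs" concrete, here is a small pure-Python HyperLogLog. It is a teaching sketch under stated assumptions, not Redis's implementation: the class name, the SHA-1 hash, and the choice of 1,024 registers are all picked for the example (Redis uses 16,384 registers, which is where the 12 KB and 0.81% figures come from), so this version is noticeably less accurate.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog with 2**p registers; parameters are assumptions."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)                # first p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Run length: number of leading zeros in the remaining bits, plus one.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)

    def merge(self, other):
        # The union of two sketches is the register-wise maximum.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]


pages = HyperLogLog()
for i in range(10000):
    pages.add(f"page:{i}")
first_estimate = pages.count()
for i in range(10000):
    pages.add(f"page:{i}")      # duplicates leave the registers unchanged
second_estimate = pages.count()
print(first_estimate, second_estimate)
```

The memory used is fixed by the number of registers, not by the number of items added, which is the "bigger on the inside" property. In Redis itself the add, count and merge operations are exposed as `PFADD`, `PFCOUNT` and `PFMERGE`.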
  • 23. Concept: Bit counting (time series)
      • It’s just bits!
      • Fixed starting point, each point represents a moment in time, flip to represent activity
      • Properties:
          – Size relative to length of time (byte round)
          – Count totals or ranges
          – BITOP (AND/XOR/OR/NOT)
      • Availability:
          – All versions of Redis
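The bit-counting idea can be illustrated without a server. The sketch below is a plain-Python stand-in for what Redis offers through `SETBIT`, `BITCOUNT` and `BITOP`; the class `ActivityBits`, its method names, and the "one slot per day" interpretation are assumptions made for the example.

```python
class ActivityBits:
    """One bit per time slot, counted from a fixed starting point."""

    def __init__(self):
        self.data = bytearray()

    def mark(self, slot):
        # Flip the bit for this slot; storage grows in whole bytes ("byte round").
        byte_index, bit_index = divmod(slot, 8)
        if byte_index >= len(self.data):
            self.data.extend(bytes(byte_index + 1 - len(self.data)))
        self.data[byte_index] |= 0x80 >> bit_index

    def is_set(self, slot):
        byte_index, bit_index = divmod(slot, 8)
        return byte_index < len(self.data) and bool(self.data[byte_index] & (0x80 >> bit_index))

    def count(self, first=0, last=None):
        # Count active slots in an inclusive slot range (the total by default).
        if last is None:
            last = len(self.data) * 8 - 1
        return sum(self.is_set(slot) for slot in range(first, last + 1))


def bit_and(a, b):
    # Slots where both series were active; bytes missing from the shorter series count as zero.
    result = ActivityBits()
    result.data = bytearray(x & y for x, y in zip(a.data, b.data))
    return result


alice = ActivityBits()
for day in (0, 3, 9):
    alice.mark(day)

bob = ActivityBits()
for day in (3, 9, 20):
    bob.mark(day)

print(alice.count(), alice.count(1, 8), bit_and(alice, bob).count())
```

Ten days of activity fit in two bytes here, and a year of daily slots would fit in 46 bytes, which is why this kind of record is cheap enough to keep inside a session.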
  • 25. Traditional Group Notification Pattern
      Process
      • A group of users gets the notification “Sale on sweaters”
      • Insert into a central table of notifications
      • Insert a row in a table for each user of the group, with the notification and a seen flag
      • Each time it is needed, query the notifications table where the seen flag is false.
  • 26. Traditional Group Notification Pattern
      Challenges
      • Adding/removing means touching a row for each user in the group.
          – Fine for groups of 10 users, what about 1 million?
          – Also multi-step
      • Storage is proportional to the size of the group and the number of notifications
      • Constant DB hits, not easily cacheable
      • Setting “read” is a DB write
  • 27. Modern & Intelligent Group Notification Pattern
      Process
      • Add notification to single group based structure or table (easily cacheable)
      • First n notifications are read by all users in group.
      • The notifications are checked to see if they are in a session-based Bloom filter or not.
      • Mark read by adding to Bloom filter in session store.
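The process on this slide might look roughly like the following. Everything named here is hypothetical: the function names, the notification dictionaries, and the ids are invented for illustration, and a plain Python `set` stands in for the per-session Bloom filter so the example stays self-contained (any object supporting `add` and `in` would fit).

```python
def unread_notifications(group_notifications, seen_filter, n=10):
    """Return the first n group notifications not marked read in this session."""
    latest = group_notifications[:n]
    return [note for note in latest if note["id"] not in seen_filter]


def mark_read(seen_filter, notification_id):
    """Marking read is a write to the session's filter, not to the database."""
    seen_filter.add(notification_id)


# One shared, cacheable list per group (hypothetical data).
group_notifications = [
    {"id": "n1", "text": "Sale on sweaters"},
    {"id": "n2", "text": "Free shipping this week"},
    {"id": "n3", "text": "New arrivals"},
]

# Stand-in for the Bloom filter kept in this user's session.
session_seen = set()

mark_read(session_seen, "n1")
unread = unread_notifications(group_notifications, session_seen)
print([note["id"] for note in unread])
```

With a real Bloom filter in place of the set, a false positive would make an unread notification occasionally appear as read; it could never make a read notification reappear, since false negatives are not possible.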
  • 28. Modern & Intelligent Group Notification Pattern
      Advantages
      • Adding a notification only writes to a single table, single row.
      • Model fits use – unread assumed.
      • Fast and consistent: the cost of checking or marking read does not depend on the number of items in the filter.
      • ~5 bits per item, but the Bloom filter doesn’t always grow.
      • Gentle scaling
  • 29. Visual Notification #1 Notification #2 Notification #3 Notification #4 Notification #5 Notification #6 ✔ ✗ ✔ ✗ ✔ ✗ Notification #1 Notification #2 Notification #3 Notification #4 Notification #5 Notification #6 ✗
  • 31. Traditional Content Surfacing Pattern (Basic)
      Process
      • Hand pick and rotate a small number of content/items
      • Stored in DB table
      • Served out dumbly to users
      Challenges
      • May serve content multiple times
      • Freshness is linked to a manual curatorial process
  • 32. Traditional Content Surfacing Pattern (Advanced)
      Process
      • Batch process builds content list to surface for each user
      • List is stored in DB Table
      • Served out to user
      • Rotated on a schedule
      Challenges
      • Not Real-time
      • May serve content multiple times
      • Un-cacheable DB content
      • Hard to scale
  • 33. Modern & Intelligent Content Surfacing Pattern
      Process
      • Middleware adds each content read to a Bloom filter stored in the session
      • Featured content list is built, can be extensive.
      • Featured items are checked vs Bloom filter on-the-fly
      Advantages
      • No DB hits for user
      • Featured content is cacheable
      • Will not show content multiple times if read
      • Tiny storage requirements even at scale
      • Freshness can be achieved with zero/low human input
      • Real-time recording of activity – immediate impact on fresh content
  • 34. Visual Content #1 Content #2 Content #3 Content #4 Content #5 Content #6 ✔ ✗ ✔ ✗ ✔ ✗ Content #1 Content #1 ✗ Content #3 Content #5
  • 35. Activity Pattern Monitoring & Personalization
  • 36. Activity Pattern Monitoring & Personalization?
      • Monitor the usage behaviour
          – Content viewed
          – Activity over time
          – Combinations of content history and activity
      • Personalize the content based on the behaviour
      • Seen as difficult to accomplish
          – Analytics data: stored in another service, anonymized
          – Complicated graph or ML based solutions: inferences, black boxes
  • 37. Activity Pattern Monitoring & Personalization
      • Record site activity with bit counting
      • Unique page views in HyperLogLog
      • Leverage the page visit Bloom filter
      • Simpler counter for pages consumed
      • Create criteria based on session stored analytics
          – New to a page? Bloom filter
          – New to the site? Unique Page view = 1 (HLL) && Previously Visited = false (Bloom)
          – Inactive user? Sum the bit count over the last five records, if = 0 then inactive
          – Been to a cluster of pages (infer interest)? Check cluster of pages vs Bloom filter – combo!
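The criteria on this slide reduce to a few small predicates over data already held in the session. The sketch below is one possible reading, with loudly hypothetical names: `SessionAnalytics` and its fields are invented, a `set` stands in for the page-visit Bloom filter, an integer stands in for the HyperLogLog count, and a list of 0/1 values stands in for the activity bits.

```python
from dataclasses import dataclass, field


@dataclass
class SessionAnalytics:
    """Hypothetical per-session analytics; each field is a stand-in."""
    seen_pages: set = field(default_factory=set)   # stand-in for the Bloom filter
    unique_page_views: int = 0                     # stand-in for the HLL count
    activity: list = field(default_factory=list)   # stand-in for activity bits, oldest first


def new_to_page(session, page):
    return page not in session.seen_pages


def new_to_site(session, page):
    # Unique page views = 1 (HLL) and previously visited = false (Bloom).
    return session.unique_page_views == 1 and page not in session.seen_pages


def inactive(session, records=5):
    # Sum the bits over the last five records; zero means inactive.
    return sum(session.activity[-records:]) == 0


def visited_cluster(session, cluster):
    # Infer interest: every page in the cluster has been seen.
    return all(page in session.seen_pages for page in cluster)


visitor = SessionAnalytics(
    seen_pages={"/sweaters", "/scarves"},
    unique_page_views=2,
    activity=[1, 0, 0, 0, 0, 0],
)

print(
    new_to_page(visitor, "/hats"),
    new_to_site(visitor, "/hats"),
    inactive(visitor),
    visited_cluster(visitor, ["/sweaters", "/scarves"]),
)
```

Because the real structures are probabilistic, each answer is an estimate: the Bloom-backed checks can occasionally report a page as seen when it was not, and the HyperLogLog count carries a small error.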
  • 38. Activity Pattern Monitoring & Personalization
      • Why is this suddenly possible?
          – Probabilistic data structures are small/fast
          – Bit counting is small/fast
          – Decoupled from operational database
      • What about privacy?
          – Legitimate concern
          – Non-reversible probabilistic structures
          – Siloed from rest of database