Boston Machine Learning
Algorithm design, user experience, and system architecture
June 2018
James Kirk
Table of Contents
Anatomy of ...: System components and ...
Design considerations and ...
Real-world recommender systems and their ...
Tools for ...: Tools for building systems
What makes a good recommender system?
What We Missed: Other subjects in recommender systems
Anatomy of
Recommendation vs Personalization
A recommendation system presents items to users in
a relevant way.
The definition of relevant is product/context-specific.
A personalization system presents recommendations
in a way that is relevant to the individual user.
The user expects their experience to change based
on their interactions with the system.
Relevance can still be product/context specific.
Users vs Items
A user in a recommender system is the party that is
receiving and acting on the recommendations.
Sometimes the user is the context, not an actual
person.
An item in a recommender system is the passive party
that is being recommended to the users.
The line between these two can be blurry.
Rec Sys #1
Users: Consultants*
Items: Projects
Recommend projects for
the consultant to bid on.
Rec Sys #2
Users: Projects
Items: Consultants
Recommend the right
consultant for the project.
Rec Sys #3
Users: Enterprises*
Items: Consultants
Recommend consultants
for relationship building.
Hearts, stars, likes, listens, watches, follows,
bids, purchases, hires, reads, views, upvotes…
Bans, skips, angry-face-reacts, 1-star reviews,
rejections, unfollows, returns, downvotes…
Explicit vs Implicit
Explicit actions are those that a user expects
or intends to impact their personalized
Implicit actions are all other interactions
between users and items.
[Figure: interactions matrix with rows User 1 to User 4 and columns Item 1 to Item 6]
Indicator Features
A feature that is unique to every user/item to
allow for direct personalization.
These features allow recommender systems
to learn about every user individually without
being diluted through metadata.
Often one-hot encoded user IDs or just an
identity matrix.
Metadata Features
Age, location, language, tags, labels, word
counts, pre-learned embeddings…
Everything that is known about a user/item
before training can be a feature if properly
structured. Should it be?
Often called “side input” or “shared features.”
User/Item Features
Indicator Features Metadata Features
[n_users x n_user_features]
[n_items x n_item_features]
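A rough sketch of these shapes: indicator features built as an identity matrix and concatenated with metadata columns. The helper `build_features` and the metadata values are made up for illustration.

```python
def identity(n):
    """Indicator features: one one-hot row per user (or item)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def build_features(metadata_rows, use_indicators=True):
    """Concatenate optional indicator columns with metadata columns.

    Returns a matrix shaped [n_rows x n_features].
    """
    n = len(metadata_rows)
    indicators = identity(n) if use_indicators else [[] for _ in range(n)]
    return [ind + list(meta) for ind, meta in zip(indicators, metadata_rows)]


# Hypothetical metadata per user: [scaled age, speaks English]
user_metadata = [[0.25, 1.0], [0.40, 0.0], [0.31, 1.0]]
user_features = build_features(user_metadata)

n_users = len(user_features)
n_user_features = len(user_features[0])  # 3 indicator + 2 metadata columns
print(n_users, n_user_features)
```

Dropping the indicator columns (`use_indicators=False`) leaves a metadata-only matrix, which trades direct personalization for the ability to describe users the system has never seen.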
A (typically) low-dimensional vector that
encodes the feature information about the
user or item.
Often called “embedding,” “latent user/item,”
or “latent representation.”
Representation size, which is the dimension of
the latent space, is often referred to as
Representation Function
The process that converts user/item features
into representations.
Learning happens here.
Common examples:
1. Matrix factorization
2. Linear kernels
3. Deep nets
4. Word2Vec
5. Autoencoders
6. None! (Pass-through)
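A minimal sketch of two of these options, a linear kernel and a pass-through. The weights would normally be learned; here they are fixed, hypothetical values.

```python
def linear_representation(features, weights):
    """Project [n_rows x n_features] features to [n_rows x n_components]."""
    return [
        [sum(f * w for f, w in zip(row, column)) for column in zip(*weights)]
        for row in features
    ]


def pass_through(features):
    """No representation function: the features are the representation."""
    return [list(row) for row in features]


# Hypothetical weights shaped [n_features x n_components]
weights = [[0.5, -1.0], [2.0, 0.0], [1.0, 1.0]]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]

representations = linear_representation(features, weights)
print(representations)
```

When the features are one-hot indicators, the first row shows the product simply selecting one row of the weights, which is the lookup that matrix factorization performs.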
A prediction from a recommender system is an
estimate of an item’s relevance to the user.
Predictions can be ranked for relevance.
The predictions are an indirect approximation
of the interactions.
Prediction Function
The process that converts user/item
representations into predictions.
Common examples:
1. Dot product
2. Cosine similarity/distance
3. Euclidean similarity/distance
4. Manhattan similarity/distance*
Some systems use deep nets for prediction,
and this can be an assumption-breaker.
*Actually, Manhattan is rare
2-Component Latent Representation Space
Common examples:
1. Dot product = User · Item
2. Cosine similarity = cos(Θ), where Θ is the angle between User and Item
3. Euclidean similarity* = ( -1 * δ ), where δ is the Euclidean distance between User and Item
4. Manhattan similarity = ( -1 * |User - Item| ), the negated sum of absolute component differences
*There are many methods for expressing Euclidean similarity
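The four prediction functions can be sketched in plain Python. The negated-distance form of Euclidean and Manhattan similarity follows the list above; the example vectors are arbitrary.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def euclidean_similarity(u, v):
    # Negated distance, so that closer vectors score higher.
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_similarity(u, v):
    return -sum(abs(a - b) for a, b in zip(u, v))


user = [3.0, 4.0]
item = [4.0, 3.0]

for predict in (dot, cosine_similarity, euclidean_similarity, manhattan_similarity):
    print(predict.__name__, predict(user, item))
```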
Loss Function
The process that converts predictions and
interactions into error for learning.
Common examples:
1. Root-mean-square error (RMSE)
2. Kullback-Leibler divergence (KLD)
3. Alternating least squares* (ALS)
4. Bayesian personalized ranking* (BPR)
5. Weighted approximately ranked
pairwise (WARP)
6. Weighted margin-rank batch (WMRB)
*These are both a loss and representation function
Loss and Learning
Some loss functions learn to approximate
the values in the interactions matrix.
Other loss functions learn to uprank positive
interactions and downrank negative
interactions (and/or non-interacted items) for
that user.
This second category of loss functions is
called learning-to-rank.
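To make the two categories concrete, here is a sketch with RMSE as the value-approximating loss and a BPR-style pairwise loss standing in for learning-to-rank. The pairwise form is one common choice, not the only one.

```python
import math


def rmse_loss(predictions, interactions):
    """Pointwise: penalize the gap between predictions and interaction values."""
    squared = [(p - n) ** 2 for p, n in zip(predictions, interactions)]
    return math.sqrt(sum(squared) / len(squared))


def pairwise_ranking_loss(positive_score, other_score):
    """BPR-style: small when the positive item outscores the other item."""
    sigmoid = 1.0 / (1.0 + math.exp(-(positive_score - other_score)))
    return -math.log(sigmoid)


predictions = [0.9, 0.2, 0.4]
interactions = [1.0, 0.0, 1.0]
print(rmse_loss(predictions, interactions))

# Only the order of the two scores matters to the ranking loss.
print(pairwise_ranking_loss(2.0, 0.0), pairwise_ranking_loss(0.0, 2.0))
```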
[Diagram: user features and item features feed representation functions, which produce the user and item representations. A prediction function turns the representations into predicted scores and predicted ranks (the output data), and a loss function produces the training loss.]
Y = Prediction
p = Prediction function
r = Representation function
X = Features
Ɛ = Loss
s = Loss Function
N = Interactions
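Read together, these symbols suggest the following composition. This is a reconstruction from the legend rather than a formula taken from the slide, and the subscripts u (user) and i (item) are added here as an assumption.

```latex
Y = p\big(r_u(X_u),\; r_i(X_i)\big), \qquad \varepsilon = s(Y, N)
```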
Interactions Features Learning
What are our
interaction values?
We must select interaction values based on
what data is available, how meaningful that
data is, and how it interacts with the rest of the
❏ What user behaviors do our interactions represent?
❏ Explicit vs implicit?
❏ Do we allow for negative interactions?
❏ How dense are our interactions?
❏ Can our recommender handle these
How does our system learn?
We must select representation functions that are
appropriate for our features as well as a
prediction function and loss function that will
learn effectively from this data.
❏ What representation functions will best
encode the user/item features?
❏ What prediction function will best estimate
❏ What loss function will learn from our data
most effectively?
❏ Do these choices scale?
What are our user/item features?
We must select user/item features from the
data available, ensure that the data is
meaningful to the recommender system, and
ensure that our use of this data is appropriate.
❏ Do we use indicator features?
❏ What useful metadata is available?
❏ Does the metadata require feature engineering?
❏ Do users expect this metadata to impact
their recommendations?
What user behaviors do our
interactions represent?
Interaction values should be an
approximation of the intended effect of the
recommender system on user behavior.
If we want people to purchase, our
interactions should be related to purchases.
If we want people to binge episodes of
shows for longer, our interactions should be
related to the act of binging.
Explicit vs Implicit
When the user gave you this signal, did they
intend/expect it to alter their
Some explicit signals don’t work well as
Negative explicit signals should be handled
with simple product logic.
“You might give five stars to Hotel Rwanda and two
stars to Captain America, but you’re much more likely to
watch Captain America.”
-Todd Yellin, Netflix, You May Also Like
Explicit vs Implicit
Does the user know we are using this signal for
Does the user care we are using this signal for
Is it ethical for us to use this signal for
Do we allow negative interactions?
Negative interactions can be valuable
statements of what content to avoid.
Negative interactions can be confusing
when learning-to-rank.
Not all loss functions accommodate negative
interactions.
Which ordering is better? Confusing?
Rank  Ordering A  Ordering B  Ordering C
1.    Positive    Positive    Positive
2.    Positive    Positive    Positive
3.    No-int      Negative    No-int
4.    No-int      Negative    Negative
5.    No-int      Negative    No-int
6.    No-int      No-int      Negative
7.    Negative    No-int      No-int
8.    Negative    No-int      Negative
9.    Negative    No-int      No-int
Do we use indicator features?
Indicator features allow for powerful
personalization but are as numerous as our
Recommenders with user indicators cannot
effectively make recommendations for new
users* (the cold-start problem).
Many users means many indicator features,
so this may not scale.
*The same is true in reverse for new items
What are our user/item features?
What useful metadata is available?
What user/item metadata do we have that is
Metadata that is useful but missing can be
requested from users, crowd-sourced, or
inferred with other ML systems.
Does the metadata require
feature engineering?
Pre-processing features can improve
recommender learning.
Some features may be useless/misleading
without feature engineering.
The choice of representation function
impacts the usefulness of feature
Do users expect this
metadata to impact their recommendations?
Is the use of this metadata ethical*?
Users can be surprised when changing
metadata impacts product experience.
*There is a distinction between metadata used in training
and metadata used in evaluation.
What representation
functions will best encode
the user/item features?
Linear kernels are effective if all we have are
indicator features or well-engineered
features. (Matrix factorization)
More complex relationships may lead us to
neural nets. How does their architecture
impact the recommender? (Use of the latent
Can the representation be learned without
interaction? (Auto-encoders, word2vec, etc)
How does our system learn?
What prediction function will
best estimate relevance?
Dot-product prediction accounts for
representation relevance and magnitude.
Cosine prediction optimizes for relevance but
has no sense for magnitude.
Euclidean prediction builds a map of items but
also has no sense for magnitude.
Should items be biased, given our choice?
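A small sketch of the magnitude point: two items pointing in the same direction get different dot-product scores but identical cosine scores. The vectors are arbitrary illustrations.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


user = [1.0, 1.0]
small_item = [1.0, 1.0]
large_item = [3.0, 3.0]  # same direction, three times the magnitude

print(dot(user, small_item), dot(user, large_item))
print(cosine(user, small_item), cosine(user, large_item))
```

With dot-product prediction a large representation can behave somewhat like a built-in item bias; with cosine or Euclidean prediction, any such bias would have to be added explicitly.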
What loss function will learn
from our data most effectively?
Do we want to estimate interactions, or
perform learning-to-rank?
Should the loss function accommodate
negative interactions? (RMSE, KLD…)
Should the loss function be sensitive to
interaction magnitude? (RMSE, B-WMRB…)
Tweaking the loss function can dramatically
change how recommendations feel.
Sparse vs Dense vs Sampled
Some implementations of loss functions only
account for user/item pairs with interactions.
These same loss functions can be written to
compare every possible user/item pair. These
predictions and losses are dense, and they can
be expensive.
Some of the most effective and efficient loss
functions learn by comparing pairs with
interactions against sampled pairs.* (WARP,
* There are many methods for sampling candidate pairs
Example: WMRB
WMRB approximates positive item rank
against a random sample and upranks
positive items through a hinge loss.
x = User
y = Positive item
y’ = Non-positive item
Y = All items
Z = Random sample of non-positive items
p = Prediction function
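A minimal sketch of that idea using the symbols above. The margin of 1 and the `log(1 + rank)` wrapper follow the usual WMRB formulation and are assumptions here, as is treating every item other than `y` as non-positive.

```python
import math
import random


def wmrb_loss(p, x, y, Y, sample_size, rng):
    """Approximate the rank of positive item y for user x from a sample Z."""
    non_positive = [item for item in Y if item != y]  # simplification
    Z = rng.sample(non_positive, sample_size)
    positive_score = p(x, y)
    # Hinge: a sampled item contributes only if it scores within the margin.
    hinge_sum = sum(max(0.0, 1.0 - positive_score + p(x, y_prime)) for y_prime in Z)
    approximate_rank = hinge_sum * len(Y) / len(Z)
    return math.log(1.0 + approximate_rank)


# Hypothetical fixed scores standing in for a learned prediction function.
scores = {"a": 5.0, "b": 1.0, "c": 0.5, "d": 0.0}
p = lambda x, item: scores[item]
Y = list(scores)
rng = random.Random(0)

well_ranked = wmrb_loss(p, "user", "a", Y, 3, rng)
badly_ranked = wmrb_loss(p, "user", "d", Y, 3, rng)
print(well_ranked, badly_ranked)
```

A positive item that already clears every sampled item by the margin contributes no loss, so learning effort goes to the positives that are ranked poorly.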
Example: Balancing WMRB
If we notice an undue popularity bias, we can
balance this by accounting for interaction
magnitudes and popularity.
x = User
y = Positive item
X = All users
p = Prediction function
n = Interaction magnitude for pair (user, item)
We can think about a recommender system
architecture as a set of top-level decisions.
When designing recommender systems, we
are evaluating the tradeoffs between these
decisions and the relationships between
these choices.
A Framework for Recommender Systems
Interactions ?
User Features ?
User Representation ?
Item Features ?
Item Representation ?
Prediction ?
Learning ?
A collaborative filter learns representations
from interactions and uses these to make
personalized recommendations, often
through matrix factorization.
Pure collaborative filters are metadata-naïve.
Example: Collaborative Filter
Interactions *
(Positive only?)
User Features Indicator
User Representation Linear
Item Features Indicator
Item Representation Linear
Prediction *
(Dot-product for MF)
Learning ALS, BPR, SVD, PCA, NMF...
A content-based recommender learns the
item features to which a user is affined.
Purely content-based systems do no transfer
learning between users.
This allows easy rec-splanation.
This requires clean item metadata.
Example: Content-based Recommender
Interactions *
User Features Indicator
User Representation Linear
Item Features Metadata
Item Representation None
(n_components = n_item_features)
Prediction Dot-product
Learning *
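A sketch of why this layout makes rec-splanation easy: with pass-through item metadata and a dot-product prediction, each feature's contribution to the score can be read off directly. Feature names, metadata, and the learned user affinities are hypothetical.

```python
def predict(user_affinity, item_metadata):
    """Dot product of the user's learned affinities and the item's metadata."""
    return sum(a * m for a, m in zip(user_affinity, item_metadata))


def explain(user_affinity, item_metadata, feature_names):
    """Per-feature contributions to the score, largest first."""
    contributions = [
        (name, a * m)
        for name, a, m in zip(feature_names, user_affinity, item_metadata)
    ]
    return sorted(contributions, key=lambda pair: pair[1], reverse=True)


feature_names = ["comedy", "action", "documentary"]
items = {
    "movie_a": [1.0, 0.0, 0.0],
    "movie_b": [0.0, 1.0, 1.0],
}
# One weight per item feature (n_components = n_item_features).
user_affinity = [0.2, 0.9, -0.5]

for name, metadata in items.items():
    print(name, predict(user_affinity, metadata), explain(user_affinity, metadata, feature_names)[0])
```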
A hybrid recommender system learns
representations for both user and item
metadata and indicators, if available.
This opens a lot of options for us.
Example: Hybrid Recommender System
Interactions *
User Features *
User Representation *
Item Features *
Item Representation *
Prediction *
Learning *
We can build a hybrid recommender system
to recommend personalized products based
on past purchases.
Example: Purchase Recommendations
Interactions Purchases
User Features Indicator
User Representation Linear
Item Features Indicator + Metadata
Item Representation *
Prediction Dot-product
Learning *
We can use the pre-trained purchase
recommender’s representations to provide
recommendations in a new context.
In this system, the “user” is the context item,
not the person using our product.
Example: “You May Also Like” (YMAL)
Interactions X
User Features Context Item Repr
User Representation None
Item Features All Item Reprs
Item Representation None
Prediction Dot-product, Cosine?
Learning X
We can take the output of the YMAL
recommender and re-rank the items based
on the customer’s representation.
This system does not learn. The learning’s
already been done.
Example: Personalized “You May Also Like”
Interactions X
User Features User Reprs
User Representation None
Item Features Similar Item Reprs
Item Representation None
Prediction Dot-product
Learning X
Example: Personalized “You May Also Like”
Step 1:
Learn to personalize
Step 2:
Use previous learning to
calculate the most similar
Step 3:
Personalize the similar
items by re-ranking
Contextualize purchase
recommendations by
limiting the item set
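Steps 2 and 3 can be sketched as follows, assuming the representations from Step 1 are already learned. The item names and vectors are invented, and the choice of cosine for similarity and dot product for re-ranking mirrors the tables above.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def most_similar(context_item, item_reprs, n):
    """Step 2: rank the other items by similarity to the context item."""
    context = item_reprs[context_item]
    others = [item for item in item_reprs if item != context_item]
    others.sort(key=lambda item: cosine(context, item_reprs[item]), reverse=True)
    return others[:n]


def personalize(user_repr, candidates, item_reprs):
    """Step 3: re-rank the similar items for one user."""
    return sorted(candidates, key=lambda item: dot(user_repr, item_reprs[item]), reverse=True)


# Hypothetical pre-learned representations
item_reprs = {
    "crib": [1.0, 0.1],
    "stroller": [0.9, 0.3],
    "car_seat": [0.8, 0.0],
    "headphones": [0.0, 1.0],
}
user_repr = [0.2, 1.0]

similar = most_similar("crib", item_reprs, n=2)
reranked = personalize(user_repr, similar, item_reprs)
print(similar, reranked)
```

Nothing is trained in either step; both only reuse the representations learned by the purchase recommender.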
Example: YouTube (Covington, Adams, Sargin)
Interactions Watches + Searches
User Features Geography, Age, Gender...
User Representation Deep net
Item Features Pre-learned embeddings, language, previous impressions...
Item Representation Deep net
Prediction Deep net
Learning Sampled Cross-Entropy
Tools for
Interactions *
User Features Indicator
User Representation Linear
Item Features Indicator
Item Representation Linear
Prediction Dot-product
Learning ALS, BPR
Implicit is a Python collaborative filter toolkit
that uses matrix factorization to learn
Includes factorization classes for ALS and
Made by Ben Frederickson.
MIT License
Interactions *
User Features Indicator
User Representation Linear
Item Features Indicator
Item Representation Linear
Prediction Dot-product
Learning SVD, PCA, NMF...
Scikit-learn is a Python machine learning
toolkit with many tools for feature
engineering and machine learning.
The decomposition package contains some
classes that can be used for matrix
factorization recommender systems like SVD,
Maintained by volunteers.
BSD license
Interactions *
User Features *
User Representation Linear
Item Features *
Item Representation Linear
Prediction Dot-product
Learning Logistic, BPR, WARP
LightFM is a Python hybrid recommender
system that uses matrix factorization to learn
Made by Lyst - a fashion shopping website.
Apache-2.0 license
TensorRec is a Python hybrid recommender
system framework for developing whole
recommender systems quickly.
Representation functions, prediction
functions, and loss functions can be
customized using TensorFlow or Keras.
Made by James Kirk.
Apache-2.0 license
Interactions *
User Features *
User Representation Linear, Deep nets, None...
Item Features *
Item Representation Linear, Deep nets, None...
Prediction Dot-product, Cosine...
Learning RMSE, KLD, WMRB...
Hey, that’s me
Annoy is a tool for fast similarity search
written in C++ with Python bindings.
Useful for building systems to serve
recommendations from pre-learned
Made by Spotify.
Apache-2.0 license
ANNOY (Approximate Nearest Neighbors Oh Yeah)
Interactions X
User Features X
User Representation X
Item Features X
Item Representation X
Prediction Cosine, Euclidean, Manhattan, Hamming
Learning X
Faiss is a tool for fast similarity search
written in C++ with Python bindings.
Useful for building systems to serve
recommendations from pre-learned
Allows item biases.
Made by Facebook.
BSD license
FAISS (Facebook AI Similarity Search)
Interactions X
User Features X
User Representation X
Item Features X
Item Representation X
Prediction Dot-product, Euclidean
Learning X
NMSLib is a tool for fast similarity search
written in C++ with Python bindings.
Useful for building systems to serve
recommendations from pre-learned
Made by Bilegsaikhan Naidan, Leonid
Boytsov, Yury Malkov, David Novak, Ben
Apache-2.0 license, with some
MIT and GNU components
NMSLib (Non-Metric Space Library)
Interactions X
User Features X
User Representation X
Item Features X
Item Representation X
Prediction Cosine, Euclidean
Learning X
We can build a hybrid recommender system
to recommend personalized news articles
based on past reading.
1. We have to learn the tastes of
individual users.
2. We know users’ home location with
low resolution (country/state).
3. Articles are ephemeral. All items are
cold-start items.
4. We can vectorize article contents and
tagged categories. (politics, sports…)
5. We have to serve production-scale
user traffic.
6. We don’t have to do rec-splanation.
Example: News Article Recommendation
Interactions Clicks, page dwells...
User Features Indicator + vectorized locations
User Representation Linear
Item Features TF-IDF of contents + vectorized categories
Item Representation Deep net
Prediction Cosine
Learning Balanced WMRB
Example: News Article Recommendation
Daily Model Training
Step 1:
Vectorize historical
article contents and
Step 2:
Use vectorized article
features to learn user
representations and train
a deep net for article
Step 3:
Build Annoy indices
Step 1:
Vectorize new article
contents and metadata
Step 2:
Use trained deep net to
calculate new article
Step 3:
Rebuild Annoy indices
with the new article
Example: News Article Recommendation
Handling New Articles
Step 1:
Retrieve the user
representation from the
Step 2:
Find most relevant
articles for the user
Example: News Article Recommendation
Serving User Traffic
Example: MovieLens with TensorRec
Interactions Movie ratings
User Features Indicator
User Representation Linear
Item Features Indicator + Movie Tags
Item Representation Linear
Prediction Dot-product
Learning Balanced WMRB
Offline Evaluation
Many metrics are available for offline
evaluation, comparing predictions against
known interactions.
Most measure novelty, diversity, and
Precision@K, Recall@K, NDCG@K…
Precision@K: “What percentage of the top K
items were positively interacted?”
Recall@K: “What percentage of users’
positively interacted items were in the top K
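Both metrics, sketched for a single user. The ranked list and the set of positively interacted items are made up for illustration.

```python
def precision_at_k(ranked_items, positives, k):
    """Share of the top K recommendations that were positively interacted."""
    hits = sum(1 for item in ranked_items[:k] if item in positives)
    return hits / k


def recall_at_k(ranked_items, positives, k):
    """Share of the user's positively interacted items found in the top K."""
    hits = sum(1 for item in ranked_items[:k] if item in positives)
    return hits / len(positives)


ranked_items = ["a", "b", "c", "d", "e", "f"]
positives = {"a", "c", "f", "z"}

print(precision_at_k(ranked_items, positives, k=5))
print(recall_at_k(ranked_items, positives, k=5))
```

System-level figures are typically the average of these per-user values, so users with few positives and users with many can pull the two metrics in different directions.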
What makes a good recommender system?
Offline Pitfalls
Many offline metrics don’t represent fairness
of performance between users or items.
These metrics can be useful for
hyperparameter optimization, but often fail to
evaluate the “feel” of recommendations.
It is hard to use offline metrics to state that
one recommender system is better than
Example: Offline Pitfalls
Three recommendation results for two users.
User 1 has 5 positive interactions.
User 2 has 2 positive interactions.
The third recommendation system is the
most broadly effective, and probably the
“best.” Precision fails to identify that, but
recall does.
You can concoct similar pitfalls for recall or
[Figure: top-5 results for User 1 and User 2 under each of the three systems]
System 1: P@5: 0.5, R@5: 0.65
System 2: P@5: 0.5, R@5: 0.5
System 3: P@5: 0.5, R@5: 0.8
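The hit counts below are not from the slides. They are one hypothetical assignment that reproduces the P@5 and R@5 values above, assuming each metric is averaged over the two users.

```python
K = 5
positives_per_user = {"user_1": 5, "user_2": 2}

# Hypothetical number of positives in each user's top 5, per system.
hits = {
    "system_1": {"user_1": 4, "user_2": 1},
    "system_2": {"user_1": 5, "user_2": 0},
    "system_3": {"user_1": 3, "user_2": 2},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def mean_precision(system_hits):
    return mean(h / K for h in system_hits.values())


def mean_recall(system_hits):
    return mean(h / positives_per_user[user] for user, h in system_hits.items())


for name, system_hits in hits.items():
    print(name, round(mean_precision(system_hits), 2), round(mean_recall(system_hits), 2))
```

Under these counts, system 2 serves User 1 perfectly and User 2 not at all, yet its precision matches that of system 3, which finds positives for both users.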
Online Evaluation
When rolling out a new recommender
system, the truest test is an A/B test against an
existing system.
The most effective feedback comes from
user interviewing and monitoring the user
behaviors the system is intended to drive.
If there is no existing system, do phased
roll-outs with quant/qual feedback.*
User interviewing is the only way to evaluate
the “feel” of recommendations.
*Fellow employees make great, but biased, guinea pigs
“I already own a crib, why would I need
Missing item filtering based on metadata?
“These songs are excellent, but I already
know these bands.”
Maybe we should target discovery?
“I’ve watched Captain America twenty
times, but that doesn’t mean I only want to
watch Marvel movies. What about the
sitcoms I watch?”
Maybe we’re oversimplifying the user’s
All Algorithms Are Biased
There are biases innate in the data we use,
the way users interact with our products, and
the way our algorithms learn.
Controlling for this is not as simple as setting
When designing these systems, we have a
responsibility to, at the least, understand the
biases in our products.
You wouldn’t ship a product without tests.
You shouldn’t ship a RecSys without
examining bias.
Algorithmic Bias and Fairness
Understanding Fairness
There are many definitions of fairness.
Some cross-section recommender
performance by user and item metadata.
Is recommendation recall significantly lower
for customers in Massachusetts?
Are movies with female leads recommended
less often than in the natural distribution of
movie watching?
Missing metadata? Crowdsource it, but be
careful with sensitive metadata.
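One way to cross-section performance by metadata is to average a per-user metric within each group. The recall values and state labels below are invented, and the sketch leaves out the statistical test needed to call a gap significant.

```python
from collections import defaultdict


def metric_by_group(per_user_metric, user_group):
    """Average a per-user metric within each metadata group."""
    grouped = defaultdict(list)
    for user, value in per_user_metric.items():
        grouped[user_group[user]].append(value)
    return {group: sum(values) / len(values) for group, values in grouped.items()}


# Hypothetical per-user recall@K and home state
per_user_recall = {"u1": 0.8, "u2": 0.6, "u3": 0.3, "u4": 0.5}
user_state = {"u1": "NY", "u2": "NY", "u3": "MA", "u4": "MA"}

recall_by_state = metric_by_group(per_user_recall, user_state)
print(recall_by_state)
```

The grouping metadata is used only for evaluation here, so it does not need to be a training feature.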
What We Missed
Sequence-based models
In what order do our users interact with
our items?
Mixture-of-tastes models
Is one representation per user enough
for users with diverse tastes?
How do system design choices impact
Attention models
Can we learn a more nuanced user
representation than just a vector?
Graphical models
Can we map relationships between
users, items, and their attributes?
Cold-start problems
How do we make recommendations for
brand-new users?
Wait, is it “recommender systems”
or “recommendation systems?”
Thank you!
James Kirk

More Related Content

What's hot

Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
Robin Reni
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
Harald Steck
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
David Zibriczky
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Girish Khanzode
Recommendation system
Recommendation systemRecommendation system
Recommendation system
Akshat Thakar
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Xavier Amatriain
Recommender system
Recommender systemRecommender system
Recommender system
Nilotpal Pramanik
Recommender systems using collaborative filtering
Recommender systems using collaborative filteringRecommender systems using collaborative filtering
Recommender systems using collaborative filtering
D Yogendra Rao
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
Justin Basilico
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
Stanley Wang
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
Milind Gokhale
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
Crossing Minds
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Yves Raimond
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
Anamta Sayyed
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
Justin Basilico
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Lior Rokach
Content based filtering
Content based filteringContent based filtering
Content based filtering
Bendito Freitas Ribeiro
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
Trieu Nguyen
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender Systems

What's hot (20)

Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
An introduction to Recommender Systems
An introduction to Recommender SystemsAn introduction to Recommender Systems
An introduction to Recommender Systems
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Recommendation system
Recommendation systemRecommendation system
Recommendation system
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
Recommender system
Recommender systemRecommender system
Recommender system
Recommender systems using collaborative filtering
Recommender systems using collaborative filteringRecommender systems using collaborative filtering
Recommender systems using collaborative filtering
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
Overview of recommender system
Overview of recommender systemOverview of recommender system
Overview of recommender system
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
Recommendation System Explained
Recommendation System ExplainedRecommendation System Explained
Recommendation System Explained
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Recommendation System
Recommendation SystemRecommendation System
Recommendation System
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
Content based filtering
Content based filteringContent based filtering
Content based filtering
Introduction to Recommendation Systems
Introduction to Recommendation SystemsIntroduction to Recommendation Systems
Introduction to Recommendation Systems
Tutorial: Context In Recommender Systems
Tutorial: Context In Recommender SystemsTutorial: Context In Recommender Systems
Tutorial: Context In Recommender Systems

Similar to Boston ML - Architecting Recommender Systems

Movie Recommender System Using Artificial Intelligence
Movie Recommender System Using Artificial Intelligence Movie Recommender System Using Artificial Intelligence
Movie Recommender System Using Artificial Intelligence
Shrutika Oswal
Colleges yvonne van_laarhoven
Colleges yvonne van_laarhovenColleges yvonne van_laarhoven
Colleges yvonne van_laarhoven
Digital Power
Modelling Personalization
Modelling PersonalizationModelling Personalization
Modelling Personalization
Bogo Vatovec
Usability in product development
Usability in product developmentUsability in product development
Usability in product development
Ravi Shyam
Towards Responsible AI - KC.pptx
Towards Responsible AI - KC.pptxTowards Responsible AI - KC.pptx
Towards Responsible AI - KC.pptx
IRJET- Opinion Mining and Sentiment Analysis for Online Review
IRJET-  	  Opinion Mining and Sentiment Analysis for Online ReviewIRJET-  	  Opinion Mining and Sentiment Analysis for Online Review
IRJET- Opinion Mining and Sentiment Analysis for Online Review
IRJET Journal
A Novel Jewellery Recommendation System using Machine Learning and Natural La...
A Novel Jewellery Recommendation System using Machine Learning and Natural La...A Novel Jewellery Recommendation System using Machine Learning and Natural La...
A Novel Jewellery Recommendation System using Machine Learning and Natural La...
IRJET Journal
The subtle art of recommendation
The subtle art of recommendationThe subtle art of recommendation
The subtle art of recommendation
Simon Belak
Design process design rules
Design process  design rulesDesign process  design rules
Design process design rules
Preeti Mishra
Introduction to MaxDiff Scaling of Importance - Parametric Marketing Slides
Introduction to MaxDiff Scaling of Importance - Parametric Marketing SlidesIntroduction to MaxDiff Scaling of Importance - Parametric Marketing Slides
Introduction to MaxDiff Scaling of Importance - Parametric Marketing Slides
IRJET- Hybrid Book Recommendation System
IRJET- Hybrid Book Recommendation SystemIRJET- Hybrid Book Recommendation System
IRJET- Hybrid Book Recommendation System
IRJET Journal
Danilo Cardona
How to write use cases
How to write use casesHow to write use cases
How to write use cases
Gloria Stoilova
Different Methodologies For Testing Web Application Testing
Different Methodologies For Testing Web Application TestingDifferent Methodologies For Testing Web Application Testing
Different Methodologies For Testing Web Application Testing
Rachel Davis
Prompt-Based Techniques for Addressing the Initial Data Scarcity in Personali...
Prompt-Based Techniques for Addressing the Initial Data Scarcity in Personali...Prompt-Based Techniques for Addressing the Initial Data Scarcity in Personali...
Prompt-Based Techniques for Addressing the Initial Data Scarcity in Personali...
IRJET Journal
Personalized recommendation for cold start users
Personalized recommendation for cold start usersPersonalized recommendation for cold start users
Personalized recommendation for cold start users
IRJET Journal
Investigation and application of Personalizing Recommender Systems based on A...
Investigation and application of Personalizing Recommender Systems based on A...Investigation and application of Personalizing Recommender Systems based on A...
Investigation and application of Personalizing Recommender Systems based on A...
Eswar Publications
IOSR Journals
Collaborative filtering- Recommendation system
Collaborative filtering- Recommendation systemCollaborative filtering- Recommendation system
Collaborative filtering- Recommendation system
CTO Boost

Boston ML - Architecting Recommender Systems

  • 1. Boston Machine Learning Architecting Recommender Systems Algorithm design, user experience, and system architecture June 2018 James Kirk
  • 2. Table of contents. Anatomy of Recommender Systems (3 - 19): System components and terminology. Designing Recommender Systems (20 - 31): Design considerations and frameworks. Example Recommender Systems (32 - 40): Real-world recommender systems and their architectures. Tools for Recommender Systems (41 - 53): Tools for building systems quickly. Evaluating Recommender Systems (54 - 58): What makes a good recommender system? What We Missed (59 - 63): Other subjects in recommender systems.
  • 4. Recommendation A recommendation system presents items to users in a relevant way. The definition of relevant is product/context-specific. Recommendation vs Personalization Personalization A personalization system presents recommendations in a way that is relevant to the individual user. The user expects their experience to change based on their interactions with the system. Relevance can still be product/context specific.
  • 7. Users A user in a recommender system is the party that is receiving and acting on the recommendations. Sometimes the user is the context, not an actual person. Users vs Items Items An item in a recommender system is the passive party that is being recommended to the users. The line between these two can be blurry.
  • 8. Example: Consultant Matchmaking (Hypothetical). Rec Sys #1: Users: Consultants*; Items: Projects. Recommend projects for the consultant to bid on. Rec Sys #2: Users: Projects; Items: Consultants. Recommend the right consultant for the project. Rec Sys #3: Users: Enterprises*; Items: Consultants. Recommend consultants for relationship building. *Personalized
  • 9. Interactions. Positive: Hearts, stars, likes, listens, watches, follows, bids, purchases, hires, reads, views, upvotes… Negative: Bans, skips, angry-face-reacts, 1-star reviews, rejections, unfollows, returns, downvotes… Explicit vs Implicit: Explicit actions are those that a user expects or intends to impact their personalized experience. Implicit actions are all other interactions between users and items.
  • 10. Interactions User 1 User 2 User 3 User 4 Item 1 Item 2 Item 3 Item 4 Item 5 Item 6
  • 11. Indicator Features A feature that is unique to every user/item to allow for direct personalization. These features allow recommender systems to learn about every user individually without being diluted through metadata. Often one-hot encoded user IDs or just an identity matrix. Metadata Features Age, location, language, tags, labels, word counts, pre-learned embeddings… Everything that is known about a user/item before training can be a feature if properly structured. Should it be? Often called “side input” or “shared features.” User/Item Features
  • 12. User/Item Features (figure): a feature matrix of shape [n_users x n_user_features] or [n_items x n_item_features], with one row per user (User 1 to User 6) and columns split into indicator features and metadata features (encoded labels, tags, etc.).
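The feature matrix described on slides 11 and 12 can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the tag vocabulary, the item tags, and the helper names (`indicator_features`, `one_hot_tags`, `hstack`) are all invented here, and a real system would normally use sparse matrices rather than nested lists.

```python
def indicator_features(n):
    """Identity matrix: row k is the one-hot indicator for user/item k."""
    return [[1.0 if col == row else 0.0 for col in range(n)] for row in range(n)]


def one_hot_tags(tags_per_row, vocabulary):
    """Encode each row's tags as a 0/1 vector over a fixed tag vocabulary."""
    return [[1.0 if tag in tags else 0.0 for tag in vocabulary]
            for tags in tags_per_row]


def hstack(left, right):
    """Concatenate two feature matrices column-wise."""
    return [l + r for l, r in zip(left, right)]


# Hypothetical item tags, invented for illustration.
item_tags = [{"politics"}, {"sports"}, {"politics", "local"}]
vocabulary = ["politics", "sports", "local"]

# Indicator columns first, metadata columns second.
item_features = hstack(indicator_features(len(item_tags)),
                       one_hot_tags(item_tags, vocabulary))
# Shape is [n_items x n_item_features] = [3 x 6].
```

Dropping the indicator columns leaves a metadata-only matrix, which is the form a purely content-based system would use.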
  • 13. Representation Functions. Representation: A (typically) low-dimensional vector that encodes the feature information about the user or item. Often called “embedding,” “latent user/item,” or “latent representation.” Representation size, which is the dimension of the latent space, is often referred to as “components.” Representation Function: The process that converts user/item features into representations. Learning happens here. Common examples: 1. Matrix factorization 2. Linear kernels 3. Deep nets 4. Word2Vec 5. Autoencoders 6. None! (Pass-through)
  • 15. Prediction Functions. Prediction: A prediction from a recommender system is an estimate of an item’s relevance to the user. Predictions can be ranked for relevance. The predictions are an indirect approximation of the interactions. Prediction Function: The process that converts user/item representations into predictions. Common examples: 1. Dot product 2. Cosine similarity/distance 3. Euclidean similarity/distance 4. Manhattan similarity/distance* Some systems use deep nets for prediction, and this can be an assumption-breaker. *Actually, Manhattan is rare.
  • 16. Prediction Functions (figure: a user vector and an item vector in a 2-component, i.e. 2-dimensional, latent representation space, separated by angle Θ and distance δ). Common examples: 1. Dot product = User · Item 2. Cosine similarity = cos(Θ) 3. Euclidean similarity* = -1 * δ 4. Manhattan similarity = -1 * |User - Item| *There are many methods for expressing Euclidean similarity.
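The four prediction functions listed on this slide can be written out directly. The sketch below uses plain Python lists for the representations; the function names and the example vectors are assumptions made for illustration, and the Euclidean and Manhattan similarities use the negated-distance form from the slide, which is only one of several possible conventions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    # cos(theta) between the two representations; ignores vector magnitude.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def euclidean_similarity(u, v):
    # Negated Euclidean distance, so that closer pairs score higher.
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_similarity(u, v):
    # Negated L1 distance.
    return -sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical 2-component representations.
user = [3.0, 4.0]
item = [4.0, 3.0]
big_item = [8.0, 6.0]  # same direction as `item`, twice the magnitude
```

With these vectors, `dot(user, big_item)` is twice `dot(user, item)`, while `cosine_similarity` gives the same value for both items. That is the magnitude sensitivity that separates dot-product prediction from cosine prediction.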
  • 17. Loss and Learning. Loss Function: The process that converts predictions and interactions into error for learning. Common examples: 1. Root-mean-square error (RMSE) 2. Kullback-Leibler divergence (KLD) 3. Alternating least squares* (ALS) 4. Bayesian personalized ranking* (BPR) 5. Weighted approximately ranked pairwise (WARP) 6. Weighted margin-rank batch (WMRB) *These are both a loss and a representation function. Learning-to-rank: Some loss functions learn to approximate the values in the interactions matrix. Other loss functions learn to uprank positive interactions and downrank negative interactions (and/or non-interacted items) for that user. This second category of loss functions is called learning-to-rank.
  • 18. (Diagram) Input data: user features, item features, and interactions. The user representation function and item representation function convert the features into the user representation and item representation. The prediction function converts the representations into the output data: predicted scores and predicted ranks. In training, the loss function compares the predictions with the interactions to produce the loss.
  • 19. Y = Prediction p = Prediction function r = Representation function X = Features Ɛ = Loss s = Loss Function N = Interactions
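Read together with the diagram on slide 18, the symbols above suggest the following composition. This is a reconstruction, not the slide's own equation, which did not survive in the transcript; the "user" and "item" subscripts are added here for clarity.

```latex
Y = p\big(r_{\text{user}}(X_{\text{user}}),\; r_{\text{item}}(X_{\text{item}})\big),
\qquad
\varepsilon = s(Y, N)
```

In words: the representation functions map features to representations, the prediction function maps a pair of representations to a prediction, and the loss function compares predictions with interactions.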
  • 21. Interactions: What are our interaction values? We must select interaction values based on what data is available, how meaningful that data is, and how it interacts with the rest of the system. Considerations ❏ What user behaviors do our interactions represent? ❏ Explicit vs implicit? ❏ Do we allow for negative interactions? ❏ How dense are our interactions? ❏ Can our recommender handle these interactions? Features: What are our user/item features? We must select user/item features from the data available, ensure that the data is meaningful to the recommender system, and ensure that our use of this data is appropriate. Considerations ❏ Do we use indicator features? ❏ What useful metadata is available? ❏ Does the metadata require feature engineering? ❏ Do users expect this metadata to impact their recommendations? Learning: How does our system learn? We must select representation functions that are appropriate for our features as well as a prediction function and loss function that will learn effectively from this data. Considerations ❏ What representation functions will best encode the user/item features? ❏ What prediction function will best estimate relevance? ❏ What loss function will learn from our data most effectively? ❏ Do these choices scale?
  • 22. What user behaviors do our interactions represent? Interaction values should be an approximation of the intended effect of the recommender system on user behavior. If we want people to purchase, our interactions should be related to purchases. If we want people to binge episodes of shows for longer, our interactions should be related to the act of binging. What are our interaction values? Explicit vs Implicit When the user gave you this signal, did they intend/expect it to alter their recommendations? Some explicit signals don’t work well as interactions. Negative explicit signals should be handled with simple product logic. “You might give five stars to Hotel Rwanda and two stars to Captain America, but you’re much more likely to watch Captain America.” -Todd Yellin, Netflix, You May Also Like
  • 23. What are our interaction values? Explicit vs Implicit Does the user know we are using this signal for recommendation? Does the user care we are using this signal for recommendation? Is it ethical for us to use this signal for recommendation?
  • 24. What are our interaction values? Do we allow negative interactions? Negative interactions can be valuable statements of what content to avoid. Negative interactions can be confusing when learning-to-rank. Not all loss functions accommodate negative interactions. Confusing? Which ordering is better? Three candidate rankings of the same nine items, listed from rank 1 to rank 9 (No-int = no interaction):
      First ordering: Positive, Positive, No-int, No-int, No-int, No-int, Negative, Negative, Negative
      Second ordering: Positive, Positive, Negative, Negative, Negative, No-int, No-int, No-int, No-int
      Third ordering: Positive, Positive, No-int, Negative, No-int, Negative, No-int, Negative, No-int
  • 25. What are our user/item features? Do we use indicator features? Indicator features allow for powerful personalization but are as numerous as our users/items. Recommenders with user indicators cannot effectively make recommendations for new users* (the cold-start problem). Many users means many indicator features, and this may not scale. *The same is true of item indicators and new items. What useful metadata is available? What user/item metadata do we have that is relevant? Metadata that is useful but missing can be requested from users, crowd-sourced, or inferred with other ML systems.
  • 26. Does the metadata require feature engineering? Pre-processing features can improve recommender learning. Some features may be useless/misleading without feature engineering. The choice of representation function impacts the usefulness of feature engineering. What are our user/item features? Do users expect this metadata to impact their recommendations? Is the use of this metadata ethical*? Users can be surprised when changing metadata impacts product experience. *There is a distinction between metadata used in training and metadata used in evaluation.
  • 27. What representation functions will best encode the user/item features? Linear kernels are effective if all we have are indicator features or well-engineered features. (Matrix factorization) More complex relationships may lead us to neural nets. How does their architecture impact the recommender? (Use of the latent space) Can the representation be learned without interaction? (Auto-encoders, word2vec, etc) How does our system learn? What prediction function will best estimate relevance? Dot-product prediction accounts for representation relevance and magnitude. Cosine prediction optimizes for relevance but has no sense for magnitude. Euclidean prediction builds a map of items but also has no sense for magnitude. Should items be biased, given our choice?
  • 28. What loss function will learn from our data most effectively? Do we want to estimate interactions, or perform learning-to-rank? Should the loss function accommodate negative interactions? (RMSE, KLD…) Should the loss function be sensitive to interaction magnitude? (RMSE, B-WMRB…) Tweaking the loss function can dramatically change how recommendations feel. How does our system learn? Sparse vs Dense vs Sampled Some implementations of loss functions only account for user/item pairs with interactions. These same loss functions can be written to compare every possible user/item pair. These predictions and losses are dense, and they can be expensive. Some of the most effective and efficient loss functions learn by comparing pairs with interactions against sampled pairs.* (WARP, WMRB) * There are many methods for sampling candidate pairs
  • 29. Example: WMRB WMRB approximates positive item rank against a random sample and upranks positive items through a hinge loss. How does our system learn? x = User y = Positive item y’ = Non-positive item Y = All items Z = Random sample of non-positive items p = Prediction function Hinge Random Sampling
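The slide's formula did not survive in the transcript, so the sketch below assumes the commonly cited form of WMRB: the positive item's rank is approximated by a hinge sum over a random sample Z of non-positive items, scaled by |Y| / |Z|, and the loss is log(1 + approximate rank). The function name, its arguments, and the margin of 1 are assumptions for illustration, not code from the talk or from any library.

```python
import math
import random


def wmrb_loss(score_positive, scores_non_positive, sample_size, rng):
    """Sampled WMRB loss for one (user, positive item) pair.

    score_positive: p(x, y) for the positive item y.
    scores_non_positive: p(x, y') for every non-positive item y'.
    """
    # Z: a random sample of non-positive items.
    sample = rng.sample(scores_non_positive, sample_size)
    # Hinge: a sampled item contributes only if it scores within the
    # margin of (or above) the positive item.
    hinge_sum = sum(max(0.0, 1.0 - score_positive + s) for s in sample)
    # Scale the sampled hinge sum up to the full item set Y.
    n_items = len(scores_non_positive) + 1
    approx_rank = (n_items / sample_size) * hinge_sum
    return math.log(1.0 + approx_rank)


rng = random.Random(0)
well_ranked = wmrb_loss(5.0, [0.0] * 10, sample_size=4, rng=rng)
badly_ranked = wmrb_loss(0.0, [0.0] * 10, sample_size=4, rng=rng)
```

A positive item that clears every sampled item by the margin contributes zero loss, so learning effort concentrates on positives that are ranked poorly.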
  • 30. Example: Balancing WMRB If we notice an undue popularity bias, we can balance this by accounting for interaction magnitudes and popularity. How does our system learn? x = User y = Positive item X = All users p = Prediction function n = Interaction magnitude for pair (user, item) Balancing Factor
  • 31. We can think about a recommender system architecture as a set of top-level decisions. When designing recommender systems, we are evaluating the tradeoffs between these decisions and the relationships between these choices. A Framework for Recommender Systems Interactions ? User Features ? User Representation ? Item Features ? Item Representation ? Prediction ? Learning ?
  • 33. A collaborative filter learns representations from interactions and uses these to make personalized recommendations, often through matrix factorization. Pure collaborative filters are metadata-naïve. Example: Collaborative Filter Interactions * (Positive only?) User Features Indicator User Representation Linear Item Features Indicator Item Representation Linear Prediction * (Dot-product for MF) Learning ALS, BPR, SVD, PCA, NMF...
  • 34. A content-based recommender learns the item features to which a user is affined. Purely content-based systems do no transfer learning between users. This allows easy rec-splanation. This requires clean item metadata. Example: Content-based Recommender Interactions * User Features Indicator User Representation Linear Item Features Metadata Item Representation None (n_components = n_item_features) Prediction Dot-product Learning *
  • 35. A hybrid recommender system learns representations for both user and item metadata and indicators, if available. This opens a lot of options for us. Example: Hybrid Recommender System Interactions * User Features * User Representation * Item Features * Item Representation * Prediction * Learning *
  • 36. We can build a hybrid recommender system to recommend personalized products based on past purchases. Example: Purchase Recommendations Interactions Purchases User Features Indicator User Representation Linear Item Features Indicator + Metadata Item Representation * Prediction Dot-product Learning *
  • 37. We can use the pre-trained purchase recommender’s representations to provide recommendations in a new context. In this system, the “user” is the context item, not the person using our product. Example: “You May Also Like” (YMAL) Interactions X User Features Context Item Repr User Representation None Item Features All Item Reprs Item Representation None Prediction Dot-product, Cosine? Learning X
  • 38. We can take the output of the YMAL recommender and re-rank the items based on the customer’s representation. This system does not learn. The learning’s already been done. Example: Personalized “You May Also Like” Interactions X User Features User Reprs User Representation None Item Features Similar Item Reprs Item Representation None Prediction Dot-product Learning X
  • 39. Example: Personalized “You May Also Like” Purchase Recommender System “YMAL” Recommender System “YMAL” Personalization System Step 1: Learn to personalize purchasing recommendations Step 2: Use previous learning to calculate the most similar items Step 3: Personalize the similar items by re-ranking OR Contextualize purchase recommendations by limiting the item set
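Steps 2 and 3 above can be sketched with pre-learned representations and a dot-product prediction. Everything concrete here is hypothetical: the item names, the 2-component vectors, and the function names are invented for illustration, and step 1 (training the purchase recommender) is assumed to have happened already.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def you_may_also_like(context_item, item_reprs, k):
    """Step 2: the context item acts as the 'user'; score other items against it."""
    scores = {name: dot(item_reprs[context_item], vec)
              for name, vec in item_reprs.items() if name != context_item}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def personalize(candidates, user_repr, item_reprs):
    """Step 3: re-rank the similar items by the customer's own representation."""
    return sorted(candidates,
                  key=lambda name: dot(user_repr, item_reprs[name]),
                  reverse=True)


# Hypothetical pre-learned representations from the purchase recommender.
item_reprs = {
    "crib": [1.0, 0.1],
    "stroller": [0.9, 0.3],
    "car seat": [0.8, 0.4],
    "guitar": [0.0, 1.0],
}
user_repr = [0.2, 1.0]

similar = you_may_also_like("crib", item_reprs, k=2)
reranked = personalize(similar, user_repr, item_reprs)
```

Neither function learns anything. Both reuse representations learned in step 1, which matches the "Learning: X" entries on slides 37 and 38.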
  • 40. Example: YouTube (Covington, Adams, Sargin) Interactions Watches + Searches User Features Geography, Age, Gender... User Representation Deep net Item Features Pre-learned embeddings, language, previous impressions... Item Representation Deep net Prediction Deep net Learning Sampled Cross-Entropy
  • 42. Implicit Interactions * User Features Indicator User Representation Linear Item Features Indicator Item Representation Linear Prediction Dot-product Learning ALS, BPR Implicit is a Python collaborative filter toolkit that uses matrix factorization to learn representations. Includes factorization classes for ALS and BPR. Made by Ben Frederickson. MIT License
  • 43. Scikit-Learn Interactions * User Features Indicator User Representation Linear Item Features Indicator Item Representation Linear Prediction Dot-product Learning SVD, PCA, NMF... Scikit-learn is a Python machine learning toolkit with many tools for feature engineering and machine learning. The decomposition package contains some classes that can be used for matrix factorization recommender systems like SVD, PCA, NMF... Maintained by volunteers. BSD license
  • 44. LightFM Interactions * User Features * User Representation Linear Item Features * Item Representation Linear Prediction Dot-product Learning Logistic, BPR, WARP LightFM is a Python hybrid recommender system that uses matrix factorization to learn representations. Made by Lyst - a fashion shopping website. Apache-2.0 license
  • 45. TensorRec is a Python hybrid recommender system framework for developing whole recommender systems quickly. Representation functions, prediction functions, and loss functions can be customized using TensorFlow or Keras. Made by James Kirk. Apache-2.0 license TensorRec Interactions * User Features * User Representation Linear, Deep nets, None... Item Features * Item Representation Linear, Deep nets, None... Prediction Dot-product, Cosine, Euclidean... Learning RMSE, KLD, WMRB... Hey, that’s me
  • 46. Annoy is a tool for fast similarity search written in C++ with Python bindings. Useful for building systems to serve recommendations from pre-learned representations. Made by Spotify. Apache-2.0 license ANNOY (Approximate Nearest Neighbors Oh Yeah) Interactions X User Features X User Representation X Item Features X Item Representation X Prediction Cosine, Euclidean, Manhattan, Hamming Learning X
  • 47. Faiss is a tool for fast similarity search written in C++ with Python bindings. Useful for building systems to serve recommendations from pre-learned representations. Allows item biases. Made by Facebook. BSD license FAISS (Facebook AI Similarity Search) Interactions X User Features X User Representation X Item Features X Item Representation X Prediction Dot-product, Euclidean Learning X
  • 48. NMSLib is a tool for fast similarity search written in C++ with Python bindings. Useful for building systems to serve recommendations from pre-learned representations. Made by Bilegsaikhan Naidan, Leonid Boytsov, Yury Malkov, David Novak, Ben Frederickson. Apache-2.0 license, with some MIT and GNU components NMSLib (Non-Metric Space Library) Interactions X User Features X User Representation X Item Features X Item Representation X Prediction Cosine, Euclidean Learning X
  • 49. We can build a hybrid recommender system to recommend personalized news articles based on past reading. Requirements: 1. We have to learn the tastes of individual users. 2. We know users’ home location with low resolution (country/state). 3. Articles are ephemeral. All items are cold-start items. 4. We can vectorize article contents and tagged categories. (politics, sports…) 5. We have to serve production-scale user traffic. 6. We don’t have to do rec-splanation. Example: News Article Recommendation Interactions Clicks, page dwells... User Features Indicator + vectorized locations User Representation Linear Item Features TF-IDF of contents + vectorized categories Item Representation Deep net Prediction Cosine Learning Balanced WMRB
  • 50. Example: News Article Recommendation Daily Model Training Scikit-learn Feature Transformation TensorRec Recommender System Annoy Ranking Step 1: Vectorize historical article contents and metadata Step 2: Use vectorized article features to learn user representations and train a deep net for article representation Step 3: Build Annoy indices
  • 51. Scikit-learn Feature Transformation TensorRec Recommender System Annoy Ranking Step 1: Vectorize new article contents and metadata Step 2: Use trained deep net to calculate new article representation Step 3: Rebuild Annoy indices with the new article Example: News Article Recommendation Handling New Articles
  • 52. Database Representation Storage Annoy Ranking Step 1: Retrieve the user representation from the database Step 2: Find most relevant articles for the user Example: News Article Recommendation Serving User Traffic
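The two serving steps can be sketched with a brute-force cosine ranking standing in for the Annoy index. This is a simplification for illustration: an approximate-nearest-neighbour index would replace the full sort at production scale, and the in-memory dictionaries, identifiers, and vectors below are hypothetical stand-ins for the database and the learned representations.

```python
import math


def cosine(u, v):
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm


def serve(user_id, user_store, article_reprs, k):
    # Step 1: retrieve the stored user representation.
    user_repr = user_store[user_id]
    # Step 2: rank articles by cosine relevance to that user (exhaustive search).
    ranked = sorted(article_reprs,
                    key=lambda a: cosine(user_repr, article_reprs[a]),
                    reverse=True)
    return ranked[:k]


# Hypothetical stored representations.
user_store = {"user-1": [1.0, 0.0]}
article_reprs = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}

top = serve("user-1", user_store, article_reprs, k=2)
```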
  • 53. Example: MovieLens with TensorRec Interactions Movie ratings User Features Indicator User Representation Linear Item Features Indicator + Movie Tags Item Representation Linear Prediction Dot-product Learning Balanced WMRB
  • 55. What makes a good recommender system? Offline Evaluation: Many metrics are available for offline evaluation that compare predictions with known interactions. Most measure novelty, diversity, and coverage. Examples: Precision@K, Recall@K, NDCG@K… Precision@K: “What percentage of the top K items were positively interacted with?” Recall@K: “What percentage of users’ positively interacted items were in the top K results?” Offline Pitfalls: Many offline metrics don’t represent fairness of performance between users or items. These metrics can be useful for hyperparameter optimization, but often fail to evaluate the “feel” of recommendations. It is hard to use offline metrics to state that one recommender system is better than another.
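The two quoted definitions translate directly into code. This is a minimal single-user sketch in plain Python; the ranked list and the set of positively interacted items are invented for illustration.

```python
def precision_at_k(ranked, positives, k):
    """Fraction of the top-k ranked items the user positively interacted with."""
    hits = sum(1 for item in ranked[:k] if item in positives)
    return hits / k

def recall_at_k(ranked, positives, k):
    """Fraction of the user's positive items that appear in the top k."""
    if not positives:
        return 0.0  # recall is undefined with no positives; report 0 here
    hits = sum(1 for item in ranked[:k] if item in positives)
    return hits / len(positives)

# Hypothetical ranking for one user, best item first.
ranked = ["a", "b", "c", "d", "e", "f"]
positives = {"b", "e", "f"}

p5 = precision_at_k(ranked, positives, 5)  # 2 hits out of 5 recommended
r5 = recall_at_k(ranked, positives, 5)     # 2 hits out of 3 positives
print(p5, r5)
```

System-level scores are usually the mean of these per-user values, which is where differences between users can get averaged away.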
  • 56. Example: Offline Pitfalls. Three recommendation results for two users. User 1 has 5 positive interactions. User 2 has 2 positive interactions. [Figure: top-5 results for users 1 and 2 under each of three systems, with true positives marked T. System 1: P@5 0.5, R@5 0.65. System 2: P@5 0.5, R@5 0.5. System 3: P@5 0.5, R@5 0.8.] The third recommendation system is the most broadly effective, and probably the “best.” Precision fails to identify that, but recall does. You can concoct similar pitfalls for recall or NDCG.
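The reported numbers can be reproduced with a short calculation. The per-user hit counts below are an assumption: they are inferred from the reported P@5 and R@5 values (they are the only splits of five hits between the two users that match), not read off the original figure.

```python
K = 5
N_POSITIVES = {"user_1": 5, "user_2": 2}

# Hits in each user's top 5, per system. Inferred from the reported metrics.
systems = {
    "system_1": {"user_1": 4, "user_2": 1},
    "system_2": {"user_1": 5, "user_2": 0},
    "system_3": {"user_1": 3, "user_2": 2},
}

def mean_precision(hits):
    """Average P@K over users."""
    return sum(h / K for h in hits.values()) / len(hits)

def mean_recall(hits):
    """Average R@K over users."""
    return sum(h / N_POSITIVES[user] for user, h in hits.items()) / len(hits)

results = {
    name: (round(mean_precision(hits), 2), round(mean_recall(hits), 2))
    for name, hits in systems.items()
}
for name, (precision, recall) in results.items():
    print(name, "P@5:", precision, "R@5:", recall)
```

Every system places five hits in ten slots, so mean precision cannot tell them apart. Recall differs because one hit matters more for the user with only two positives.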
  • 57. Online Evaluation: When rolling out a new recommender system, the truest test is an A/B test against an existing system. The most effective feedback comes from interviewing users and monitoring the user behaviors the system is intended to drive. If there is no existing system, do phased roll-outs with quant/qual feedback.* User interviewing is the only way to evaluate the “feel” of recommendations. *Fellow employees make great, but biased, guinea pigs. Feel? “I already own a crib, why would I need another?” Missing item filtering based on metadata? “These songs are excellent, but I already know these bands.” Maybe we should target discovery? “I’ve watched Captain America twenty times, but that doesn’t mean I only want to watch Marvel movies. What about the sitcoms I watch?” Maybe we’re oversimplifying the user’s representation?
  • 58. Algorithmic Bias and Fairness. All Algorithms Are Biased: There are biases innate in the data we use, the way users interact with our products, and the way our algorithms learn. Controlling for this is not as simple as setting biased=False. When designing these systems, we have a responsibility to, at the least, understand the biases in our products. You wouldn’t ship a product without tests. You shouldn’t ship a RecSys without examining bias. Understanding Fairness: There are many definitions of fairness. Some cross-section recommender performance by user and item metadata. C-fairness: Is recommendation recall significantly lower for customers in Massachusetts? P-fairness: Are movies with female leads recommended less often than in the natural distribution of movie watching? Missing metadata? Crowdsource it, but be careful with sensitive metadata.
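A C-fairness check of the kind described can start as a simple cross-section of a per-user metric by user metadata. The sketch below averages recall per group in plain Python; the recall values, the state labels, and the `recall_by_group` helper are all hypothetical, and deciding whether a gap is "significantly lower" would still need a proper statistical test on real data.

```python
from collections import defaultdict

def recall_by_group(user_recall, user_group):
    """Average per-user recall within each metadata group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, recall in user_recall.items():
        group = user_group[user]
        totals[group] += recall
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}

# Hypothetical per-user Recall@K values and home states (illustration only).
user_recall = {"u1": 0.8, "u2": 0.6, "u3": 0.2, "u4": 0.4}
user_group = {"u1": "NY", "u2": "NY", "u3": "MA", "u4": "MA"}

by_state = recall_by_group(user_recall, user_group)
gap = by_state["NY"] - by_state["MA"]
print(by_state, gap)
```

The same grouping applied to item metadata, with recommendation counts in place of recall, gives a starting point for the P-fairness question.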
  • 60. What We Missed. 1. Sequence-based models: In what order do our users interact with our items? 2. Mixture-of-tastes models: Is one representation per user enough for users with diverse tastes? 3. Rec-splanation: How do system design choices impact interpretability? 4. Attention models: Can we learn more nuance in the user representation than just a vector? 5. Graphical models: Can we map relationships between users, items, and their attributes? 6. Cold-start problems: How do we make recommendations for brand-new users?
  • 62. Wait, is it “recommender systems” or “recommendation systems?” ¯_(ツ)_/¯