The documentation of the Topics API for the web links to a demo on Google Colab that runs inference with the model used by Chrome. However, this demo does not follow the same algorithm as the one executed in Google Chrome. As a result, classification results differ.
Some differences in the Colab:
- The old taxonomy (v1) is used.
- A pre-processing step is missing: removal of the www. prefix.
- The post-filtering of model inference scores is incorrect: Chrome's implementation is more involved (top 5 scores kept, minimum thresholds, check of the "Unknown" topic contribution, normalization, etc.).
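To make the last two points concrete, here is a minimal Python sketch of the kind of pre-processing and post-filtering described above. It is not Chrome's code: the function names, the `UNKNOWN_TOPIC_ID` value, the threshold defaults, and the exact order of the steps are assumptions for illustration and should be checked against the Chromium source before relying on them.

```python
from typing import Dict, List, Tuple

# Hypothetical identifier for the "Unknown" topic; the real ID may differ.
UNKNOWN_TOPIC_ID = -2


def strip_www_prefix(hostname: str) -> str:
    """Remove a leading "www." from a hostname before classification."""
    prefix = "www."
    return hostname[len(prefix):] if hostname.startswith(prefix) else hostname


def post_filter(
    scores: Dict[int, float],
    max_topics: int = 5,                 # "top 5 scores kept"
    max_unknown_share: float = 0.8,      # assumed threshold, not confirmed
    min_score: float = 0.01,             # assumed threshold, not confirmed
    min_normalized_score: float = 0.25,  # assumed threshold, not confirmed
    unknown_id: int = UNKNOWN_TOPIC_ID,
) -> List[Tuple[int, float]]:
    """Filter raw model scores into (topic_id, normalized_score) pairs."""
    # Keep only the highest-scoring topics.
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:max_topics]
    total = sum(score for _, score in top)
    if total <= 0:
        return []

    # If "Unknown" dominates the kept scores, report no topic at all.
    unknown_score = dict(top).get(unknown_id, 0.0)
    if unknown_score / total > max_unknown_share:
        return []

    # Drop "Unknown" and low raw scores, then normalize by the top-N total
    # (assumed normalization) and apply the normalized threshold.
    result = []
    for topic_id, score in top:
        if topic_id == unknown_id or score < min_score:
            continue
        normalized = score / total
        if normalized >= min_normalized_score:
            result.append((topic_id, normalized))
    return result
```

For example, `post_filter({1: 0.6, 2: 0.3, 3: 0.005, UNKNOWN_TOPIC_ID: 0.05})` keeps topics 1 and 2 under these assumed defaults, while an input dominated by the "Unknown" topic yields an empty list. A demo that simply returns every score above a single cutoff would behave differently in both cases.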
I would suggest updating the Colab demo to exactly mirror Google Chrome's implementation of the Topics API for the web. This would avoid potential confusion due to classification mismatches between the Colab and Chrome implementations.
Resources
In this blog post and this paper, I describe the steps Google Chrome performs when the Topics API classifies a hostname; see the post-filtering algorithm in particular.
Hi, thanks for creating the issue. The Colab demo that you reference was meant as a one-time demonstration of how one might extract and use the classifier model, not as an evergreen document. We leave it as an exercise to developers to look at Chrome's code and to keep up with further changes if they want to copy Chrome's behavior over time.
Here is my correct reimplementation of the classification performed in Google Chrome: https://github.com/yohhaan/topics_classifier