Feedback on epsilon for summary reports #485

Open
csharrison opened this issue Jun 8, 2022 · 0 comments
csharrison commented Jun 8, 2022

Summary reports currently have an adjustable dial (called epsilon) that controls how much noise the API adds. The noise is drawn from a Laplace distribution with a mean of zero and a standard deviation of

sqrt(2) * L1 / epsilon

where L1 is the L1 sensitivity, which (during the initial origin trial) is 2^16 = 65536.
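
To make the formula concrete, here is a minimal sketch that evaluates the noise standard deviation for a given epsilon and draws simulated noise. The function names are hypothetical and not part of the API, and continuous Laplace sampling is an assumption made for illustration; the noise actually applied by the service may be generated differently.

```python
import math
import random

# L1 sensitivity stated above for the initial origin trial.
L1_SENSITIVITY = 2 ** 16  # 65536


def laplace_noise_std(epsilon, l1=L1_SENSITIVITY):
    """Standard deviation of the noise: sqrt(2) * L1 / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return math.sqrt(2) * l1 / epsilon


def sample_laplace_noise(epsilon, l1=L1_SENSITIVITY, rng=random):
    """Draw one zero-mean Laplace sample with scale b = L1 / epsilon.

    Illustration only: the difference of two independent exponential
    variables with mean b follows a Laplace(0, b) distribution.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = l1 / epsilon
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


if __name__ == "__main__":
    for eps in (1, 3, 10):
        print(f"epsilon={eps}: std={laplace_noise_std(eps):.1f}")
```

Halving epsilon doubles the standard deviation, so the figure to compare against is the size of your scaled aggregate values rather than the raw counts.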

Please use this issue to submit feedback on what values of epsilon work for you using the following template:

Background

Describe your high-level use case in a sentence or two. If you are comfortable doing so, please share your affiliation so that we can understand the different types and sizes of stakeholders reflected in the responses.

Smallest epsilon required

Please inform us of the smallest value of epsilon required to support the minimum viable functionality of your system. Explain as best as you can why this epsilon is needed. For example:

“The smallest epsilon we can tolerate is 3. For any impression, we only ever measure a single conversion, which we use to generate per-campaign count breakouts. An epsilon of 3 therefore allows us to measure per-campaign counts with a standard deviation of ~0.5, which is the maximum noise we can tolerate on our smallest slices (counts of ~10).”
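
The arithmetic behind an answer like the one above can be sketched as follows. This assumes each conversion contributes the full L1 budget (2^16) to a single count, so that dividing the noise by that contribution expresses it in units of conversions. The helper names are hypothetical.

```python
import math

L1_SENSITIVITY = 2 ** 16  # 65536, as stated for the initial origin trial


def noise_std_in_counts(epsilon, contribution_per_event, l1=L1_SENSITIVITY):
    """Noise standard deviation after dividing by the per-event contribution."""
    return math.sqrt(2) * l1 / (epsilon * contribution_per_event)


def smallest_epsilon(max_std_in_counts, contribution_per_event, l1=L1_SENSITIVITY):
    """Smallest epsilon that keeps the rescaled noise at or below the target."""
    return math.sqrt(2) * l1 / (max_std_in_counts * contribution_per_event)


# Assumption: one conversion per impression, contributing the whole budget.
print(round(noise_std_in_counts(3, L1_SENSITIVITY), 2))   # 0.47
print(round(smallest_epsilon(0.5, L1_SENSITIVITY), 2))    # 2.83
```

If the budget is split across several measurements, `contribution_per_event` shrinks and the required epsilon grows proportionally.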

Major inflection points

Are there any other inflection points with respect to epsilon that you discovered when experimenting with the API? For example:

  • Is there a value of epsilon below which data is completely useless?
  • Are there values of epsilon that allow you to achieve a subset of functionality, but not your complete desired functionality?
  • What variables impact decision making for setting the epsilon value?

Other feedback on noise generation

Use this space to give other general feedback on the noise in the API. For example:

  • Did you have problems specific to Laplace noise, as opposed to, say, Gaussian noise?
  • Did you have any issues scaling your data by the L1 sensitivity (2^16)?
  • It is very difficult for us to use the budget without the use of adaptive queries
  • Composing many features together in the API is hard with the current budgeting setup