
purpose specification: make the purpose explicit, or, if general-purpose, update to reflect the wider range of uses/abuses #216

Open
npdoty opened this issue Jun 30, 2023 · 4 comments


@npdoty

npdoty commented Jun 30, 2023

The justification for this API is to enable support for interest-based advertising without relying on cross-site tracking of user's browsing activity.

There are significant privacy advantages to designed-for-purpose APIs. Legal compliance in many jurisdictions would also be more straightforward. As a principle of privacy and data protection, information provided for one purpose should not generally be used for other purposes, and people cannot generally consent to unspecified purposes. If there is a clear purpose, then users can make more informed choices about it, attestations can be made about it, and abuses of the API for other purposes can be identified and mitigated. If interest-based advertising for the current visit is the purpose of the API, we should make that explicit in specs/explainers and in attestations about use of the API.
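For concreteness, the data at issue is what a caller receives from the Topics API. A minimal sketch follows; the `document.browsingTopics()` entry point and the response fields are from the Topics API proposal, while the helper function and the mocked values are purely illustrative:

```javascript
// In a real page, the entry point is the browser-provided call:
//   const topics = await document.browsingTopics();
// Each returned entry is an object such as:
//   { topic: 254, taxonomyVersion: "1", configVersion: "...", modelVersion: "...", version: "..." }
// where `topic` is a numeric ID into the browser's public topics taxonomy.

// Illustrative helper (not part of the API): pull out the numeric topic IDs
// that a site might forward to its ad server -- or, as this issue worries,
// to any other consumer.
function topicIds(topics) {
  return topics.map((t) => t.topic);
}

// Mocked response so the sketch runs outside a browser; the IDs are arbitrary.
const mocked = [
  { topic: 254, taxonomyVersion: "1" },
  { topic: 180, taxonomyVersion: "1" },
];
console.log(topicIds(mocked)); // [ 254, 180 ]
```

Note that nothing in the API surface itself constrains what the caller does with these IDs once received, which is why the purpose question above falls to specs, attestations, or regulation rather than to the browser at call time.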

However, if the purpose is not intended to be limited to the interest-based advertising use case -- if, for example, the goal is to support personalizing pages based on browsing history or to enable dynamic pricing (#34) or to support selling data about a person's interests to data brokers, then we should at a minimum update specs/explainers to cover those cases. I am not a European data protection lawyer, but I would anticipate that this could create compliance burdens for browsers that implement the API and sites that consume it.

This is (somewhat) new ground for Web features because it includes inferences of personal data and providing personal data for a particular purpose rather than enabling a general communication mechanism between user agent and server. It could also be an opportunity for better privacy practices.

@michaelkleber
Collaborator

Hi Nick, sorry for the delay in responding here.

The intent behind the API design is indeed, just as you say, "to enable support for interest-based advertising without relying on cross-site tracking of user's browsing activity". As with any browser API, it's entirely possible that someone will come up with another idea for how to use it. Presumably they would then try it out and see whether the API is, in practice, actually useful for that other goal. I don't know what the result would be; we've put years of work into making this API useful for one thing, and some of these alternate applications seem far-fetched to me.

But sure, let's say that hypothetically Reddit finds that they actually get more engagement by picking which articles to show on the homepage based on the viewer's topics. How should browsers think about this? I can broadly think of three points of view here:

  1. The "old school" way: Anything goes, browser has no opinion on how the data is used. This is no different than Reddit choosing to show you more Apple-related stories if your User Agent string says you're browsing on an iPhone.

  2. The "data steward" way: Browser believes this data should only be used for ad selection, takes action against Reddit because this is "using it wrong".

  3. The "tracking prevention" way: Browser believes that this data must not be used for tracking, but for non-tracking uses that don't seem abusive, browsers do not interfere.

Option 3 here is where Privacy Sandbox has generally landed: there is a clear harm that we're trying to prevent, but there is also a wide area in between "deliberately trying to support" and "deliberately trying to prevent" where we are not in a position to really adjudicate whether other uses are good or bad. (Heh, I just realized that this hypothetical Reddit personalization use case is one good example of why: frankly I have no idea whether or not that should count as "interest-based advertising" — there genuinely is advertising work on site-internal links for increasing engagement.)

I am coincidentally also not a European data protection lawyer. But it seems to me that if the party that actually uses the Topics signal has some disclosure obligation about how they use data, then the obligation ought to apply equally, whether or not they use the data the way that we browser folks imagined it would be used.

@npdoty
Author

npdoty commented Aug 1, 2023

I'm not sure I see the bright line between your approaches 2. and 3. Are there uses of these inferences that are explicitly different from how they were described to the user that aren't abusive? Information provided for one purpose should not generally be used for other purposes, and people cannot generally consent to unspecified purposes.

Alternatively, is there a reason that attestation is used to provide accountability for only this one kind of abuse?

Where I live, the social media site I browse to could (legally, as far as I know, and without violating the proposed not-used-for-tracking attestation) combine the inferred interests received from my browser through this API with what they know about my name/email address and sell that data to a data broker, who could sell that information on to others in order to enrich their profiles about my name/email address. Attestation would be one way to mitigate that abuse (by allowing out-of-band accountability for sites that used that information for other purposes, like selling it to a data broker) and make users more comfortable with sharing inferences about their ad interests.

For your hypothetical, perhaps it isn't immediately clear whether on-site personalization of content based on ad interest topics is consistent with providing topics for interest-based advertising and what users understand or intend when they make use of this feature. We could clarify that, or we could leave it ambiguous so that users have less clarity about how their data is used and so that sites have less clarity about what uses of data are acceptable. I believe our experience on the Web has shown that this ambiguity tends over time to be interpreted by sites using data for any potentially lucrative purpose and users assuming their behavior is constantly surveilled. I would like us to pursue a better path going forward.

@michaelkleber
Collaborator

> I'm not sure I see the bright line between your approaches 2. and 3.

The simplest differentiator I can point at is an allow-list vs deny-list approach. But more generally, I am extremely reluctant to go down any path where we try to say that novel or unclear things are forbidden.

Chrome, or browsers in general, do not have a "judiciary branch" where we can build precedent and nail down the details over time — that is very much the territory of regulators. We certainly don't want to pretend to be a regulator, nor do we want to imply that we know what regulators of the future will decide.

Asking callers of the API to publicly attest that "we will not use this data to identify people across sites" means that if some party does use the data for cross-site identification, then their regulator is in a position to enforce whatever penalty is appropriate for a company saying one thing and doing the opposite. All other "compliance burdens for browsers that implement the API and sites that consume it" are just like they would be for an API that did not come with a public attestation at all.

> We could clarify that, or we could leave it ambiguous so that users have less clarity about how their data is used and so that sites have less clarity about what uses of data are acceptable.

I sympathize with what you're asking for, but the big question here, "clarity about what uses of data are acceptable", must lie with regulators and cannot possibly be a browser decision. It's possible that browsers could strengthen the attestations beyond just the single clear statement we're starting with — maybe that should be future work. But given how new this gating mechanism is, I think it's prudent for us to wade in cautiously.

@npdoty
Author

npdoty commented Sep 25, 2023

> > We could clarify that, or we could leave it ambiguous so that users have less clarity about how their data is used and so that sites have less clarity about what uses of data are acceptable.
>
> I sympathize with what you're asking for, but the big question here, "clarity about what uses of data are acceptable", must lie with regulators and cannot possibly be a browser decision. It's possible that browsers could strengthen the attestations beyond just the single clear statement we're starting with — maybe that should be future work. But given how new this gating mechanism is, I think it's prudent for us to wade in cautiously.

Perhaps this isn't a decision for regulators or browser vendors but instead for users. Indeed, I believe existing data protection regulation in many jurisdictions already contains such protections. Data provided for one purpose is not to be used for other purposes. The user agent can be an agent of the user and let the user specify the acceptable purpose (or not provide the data if the recipient hasn't committed to what purposes they will use it for), or we can leave it up to sites, and perhaps eventually to regulators.

I agree that it's prudent to be cautious here, given our extensive past experience with the abuse of Web APIs for surveilling and profiling users. The cautious path, it seems to me, is to have clear, narrow scope about uses, not to expose data and leave its use open to novel experimentation.
