Create Special:PageData as a canonical entry point for machine readable page data.
Closed, Resolved, Public

Description

Special:PageData serves as a data retrieval endpoint, as described by T161527: RFC: Canonical data URLs for machine readable page content. Its purpose is not to provide an API, but to offer a unified naming scheme for accessing machine readable page content via any API.

The initial implementation of Special:PageData should do the following:

  • From the request, extract a page title and a slot name:
    • Subpage syntax should be supported; e.g. Special:PageData/main/User:Foo means slot "main", title "User:Foo". If subpage syntax is used, both slot and title must be given.
    • Simple request parameters can be used; e.g. Special:PageData?title=User:Foo&slot=main means slot "main", title "User:Foo". The slot parameter is optional and defaults to "main".
    • If no title is given in the request, a message explaining the purpose of the special page should be shown. A form for entering title and slot would be nice, but is not initially required.
  • Check that the given slot is supported by the given page.
    • Until T107595: [RFC] Multi-Content Revisions is implemented, all pages support the "main" slot, and only the "main" slot. So for now, this can just check that the slot parameter is "main".
  • If the client sends an Accept header for content negotiation, check that the page's (default) serialization format (mime type) is compatible with that header. If not, send status 406.
    • It should be possible to extend/override content negotiation using a hook.
  • Respond with an HTTP redirect (status 303) to the page's raw data.
    • The default target for the redirect is the action=raw API, i.e. $title->getFullUrl( [ 'action' => 'raw' ] ). The slot name can be added in the future. A setting should be provided that allows this to be changed to use the MediaWiki REST API.
    • Call a hook that allows modifying the redirect target. For instance, Wikibase may redirect to its own Special:EntityData instead of action=raw.
    • The redirect should be marked as non-cacheable for now. Caching the redirect would require web caches to vary on the Accept header.
  • Special:PageData should not be listed on Special:SpecialPages.
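The request handling described in the list above can be sketched roughly as follows. This is a language-neutral illustration in Python, not MediaWiki code: the names `parse_request`, `redirect_target`, `handle`, and `PageDataError`, the example base URL, the 400 status for a malformed or unsupported request, and the exact Cache-Control value are all assumptions made for the example. Only the 303 redirect, the "main" default, and the action=raw target come from the description.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Per the description, only the "main" slot exists until multi-content revisions land.
SUPPORTED_SLOTS = {"main"}


class PageDataError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


def parse_request(subpage: Optional[str], params: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (slot, title), or None when no title was given (show the help message)."""
    if subpage:
        # Subpage syntax is "<slot>/<title>". Split on the first slash only,
        # because page titles may themselves contain slashes.
        slot, sep, title = subpage.partition("/")
        if not sep or not slot or not title:
            # Assumption: the description does not name a status for this case.
            raise PageDataError(400, "subpage syntax requires both slot and title")
    else:
        title = params.get("title", "")
        slot = params.get("slot", "main")  # slot is optional and defaults to "main"
        if not title:
            return None
    if slot not in SUPPORTED_SLOTS:
        # Assumption: status code chosen for the example.
        raise PageDataError(400, "unsupported slot: " + slot)
    return slot, title


def redirect_target(base_url: str, title: str) -> str:
    """Stand-in for $title->getFullUrl( [ 'action' => 'raw' ] )."""
    return base_url + "?" + urlencode({"title": title, "action": "raw"})


def handle(subpage: Optional[str], params: Dict[str, str],
           base_url: str = "https://wiki.example/w/index.php") -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a Special:PageData request."""
    parsed = parse_request(subpage, params)
    if parsed is None:
        return 200, {}, "Message explaining the purpose of Special:PageData"
    _slot, title = parsed  # the slot is not yet part of the redirect target
    headers = {
        "Location": redirect_target(base_url, title),
        # Non-cacheable for now; caching would require varying on Accept.
        "Cache-Control": "private, no-cache, max-age=0",
    }
    return 303, headers, ""
```

For example, `handle("main/User:Foo", {})` and `handle(None, {"title": "User:Foo"})` both yield a 303 whose Location points at the raw action for User:Foo. A hook that rewrites the redirect target would slot in between `redirect_target` and building the headers.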

Event Timeline

I have been working on this for the past couple of days. HTTP content negotiation is not possible without moving most of the LinkedData namespace classes into core, which I think is good. @daniel: Do you think we should move HttpAcceptNegotiator, HttpAcceptParser, and EntityDataFormatProvider (with some modifications) to core? As far as I checked, their code is mostly unrelated to Wikibase.

@Ladsgroup: HttpAcceptNegotiator and HttpAcceptParser can and should be moved to core, yes.

I don't think we need the equivalent of EntityDataFormatProvider in core: Special:PageData doesn't need to (and really cannot) use file name extensions, or API serialization formats, or format names.

All that Special:PageData needs is a list of supported mime types for a given page. That list is provided by ContentHandler::getSupportedFormats() and can be checked using isSupportedFormat(). But perhaps we should even support just one format per page for now, the one returned by ContentHandler::getDefaultFormat(). If there is just one format, we may not even need HttpAcceptNegotiator for now.
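As a rough illustration of the single-format case: with only one candidate mime type, the check reduces to asking whether the Accept header admits that type at all. The sketch below is in Python and is a simplified stand-in, not the HttpAcceptNegotiator or HttpAcceptParser code; the function names are hypothetical, and `text/x-wiki` is used only as an example default format.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs. Simplified parser."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # quality defaults to 1 when no q parameter is present
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def specificity(media_range: str) -> int:
    """0 for */*, 1 for type/*, 2 for an exact type/subtype."""
    if media_range in ("*/*", "*"):
        return 0
    return 1 if media_range.endswith("/*") else 2


def range_matches(media_range: str, mime: str) -> bool:
    """True if the media range covers the given mime type."""
    if media_range in ("*/*", "*"):
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = mime.lower().partition("/")
    return r_type == m_type and (r_sub == "*" or r_sub == m_sub)


def is_acceptable(accept_header: Optional[str], default_format: str) -> bool:
    """Check whether the page's single default format satisfies the Accept header."""
    if accept_header is None or not accept_header.strip():
        return True  # no Accept header: the client accepts anything
    best = None  # (specificity, q) of the most specific matching range
    for media_range, q in parse_accept(accept_header):
        if range_matches(media_range, default_format):
            spec = specificity(media_range)
            if best is None or spec > best[0]:
                best = (spec, q)
    return best is not None and best[1] > 0


def status_for(accept_header: Optional[str], default_format: str) -> int:
    """303 if the redirect may proceed, 406 if the format is not acceptable."""
    return 303 if is_acceptable(accept_header, default_format) else 406
```

With this, `status_for("application/json", "text/x-wiki")` gives 406, while `status_for("text/*;q=0.5", "text/x-wiki")` gives 303. Supporting several formats per page would need real negotiation (picking the best of multiple candidates), which is where HttpAcceptNegotiator would come in.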

Change 356121 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Move HttpAcceptNegotiator and HttpAcceptParser from Wikibase to core

https://gerrit.wikimedia.org/r/356121

@daniel: I made a patch for moving these classes for now. Once it is merged, I'll give Special:PageData another try to see what I can do about it.

Change 356616 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Start a very basic version of Special:PageData

https://gerrit.wikimedia.org/r/356616

Change 356121 merged by jenkins-bot:
[mediawiki/core@master] Move HttpAcceptNegotiator and HttpAcceptParser from Wikibase to core

https://gerrit.wikimedia.org/r/356121

Change 356616 merged by jenkins-bot:
[mediawiki/core@master] Start a very basic version of Special:PageData

https://gerrit.wikimedia.org/r/356616

Change 358360 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Use target instead title in SpecialPageData

https://gerrit.wikimedia.org/r/358360

Change 358360 merged by jenkins-bot:
[mediawiki/core@master] Use "target" instead "title" as the param name in SpecialPageData

https://gerrit.wikimedia.org/r/358360

Change 358372 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Move HttpAccept* to libs

https://gerrit.wikimedia.org/r/358372

Change 358372 merged by jenkins-bot:
[mediawiki/core@master] Move HttpAccept* to libs

https://gerrit.wikimedia.org/r/358372

Change 358379 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/Wikibase@master] Drop HttpAccept* from Wikibase and use moved ones from core

https://gerrit.wikimedia.org/r/358379

Change 358379 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Drop HttpAccept* from Wikibase and use moved ones from core

https://gerrit.wikimedia.org/r/358379

Change 358540 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Make Special:PageData accept two-part subpage

https://gerrit.wikimedia.org/r/358540

Change 358540 merged by jenkins-bot:
[mediawiki/core@master] Make Special:PageData accept two-part subpage

https://gerrit.wikimedia.org/r/358540

We can call this done now, feel free to re-open it if you think otherwise.

Given the large number of applications, a coding convention for the PageData pages seems desirable.
The usual standards could serve as that convention, covering JS, Lua ... where values may eventually contain some wikicode.
Do we need a coding convention, limited but extensible, along with discussions on how to extend it?

How do we use Special:PageData pages? How do we create them? How do we read them?
Could you answer these questions in the Lua reference manual, and/or elsewhere?

7 years since, same question. Where is Special:PageData documented apart from this task? Couldn't find anything.

That's mostly because the project is unfinished and doesn't do anything. T163921 and many of the subtasks are still open. T161527 contains most of the information you are probably looking for. I also find T176764 quite interesting as it shows how this was meant to be used. But this never happened, as far as I can tell.