Commons:Bots/Work requests

From Wikimedia Commons, the free media repository

Shortcuts: COM:BR • COM:BWR

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days.


# Bot request Status 💬 👥 🙋 Last editor 🕒 (UTC) 🤖 Last botop editor 🕒 (UTC)
1 Add Template:User category (CL) 10 4 DaxServer 2024-07-15 15:22 DaxServer 2024-07-15 15:22
2 DGJ file descriptions from Flickr 2 2 DaxServer 2024-07-14 19:40 DaxServer 2024-07-14 19:40
3 Hidden categories added as Category:Hidden categories Resolved 6 3 Fl.schmitt 2024-07-14 10:36 Fl.schmitt 2024-07-14 10:36
4 sorting files 9 3 Hanooz 2024-07-15 16:00 DaxServer 2024-07-15 13:58
5 Revert additions to Category:History by Mitte27 7 5 Enhancing999 2024-06-30 11:06 Cryptic-waveform 2024-06-25 13:04
6 Convert Category:Photographs by Carol M. Highsmith to JPEG 16 5 Jeff G. 2024-07-13 12:59 Jeff G. 2024-07-13 12:59


Add Template:User category (CL)

Please add {{User category |1=Chris Light}} to the user categories of @Chris Light. This ensures they are in Category:User categories (flat list) and marked as hidden categories.

To find them: https://petscan.wmflabs.org/?psid=28135244

I fixed the few inclusions that weren't user categories (or shouldn't be): sample. A few are still to fix:

About 60 are currently not hidden categories: sample. They will be once the template is added. Enhancing999 (talk) 09:49, 29 April 2024 (UTC)

This can be easily done using AutoWikiBrowser (see example diff). Getting the categories into AWB is quite easy: in Petscan, select "Wiki" as output format and save the result as a txt file. Now, in AWB, you can use that file to generate a list by selecting "Text file (UTF-8)" as source. The only problem is that a bot-enabled account would be useful for this, otherwise you'll have to manually confirm 4540 edits... Fl.schmitt (talk) 09:57, 14 July 2024 (UTC)
I have a bot account at DaxBot, so I can take on this work. Let me know -- DaxServer (talk) 14:33, 14 July 2024 (UTC)
@DaxServer: Would be great - thanks a lot! I'm not sure if a bot additionally needs AWB access. I've got AWB access, but my bot was approved for different tasks... Anyway, after AWB approval, creating the task in AWB is very easy, since the only modification required is prepending the User category: no regex, no replacements or deletions. Fl.schmitt (talk) 15:22, 14 July 2024 (UTC)
@Fl.schmitt My bot was approved for different tasks too, and I will need to apply for approval for this one-time task. I'd use the replace script (https://doc.wikimedia.org/pywikibot/stable/scripts/main.html#replace-script) to replace __HIDDENCAT__ with {{User category|Chris Light}}. If the template is already inserted, I'll skip the page. Here is the command:
pwb replace -subcatsr:"Image by Chris Light" -summary:"Summary goes here" -ns:Category -excepttext:"{{User category" "__HIDDENCAT__" "{{User category|1=Chris Light}}"
I'll check the petscan query after the run and handle the missed ones. How does this sound? -- DaxServer (talk) 17:34, 14 July 2024 (UTC)
@DaxServer: Sounds great, thanks a lot for the explanation - I'm still learning! I didn't think about using one of the pwb standard scripts - that's really useful! Fl.schmitt (talk) 19:05, 14 July 2024 (UTC)
Cool. Filed it at Commons:Bots/Requests/DaxBot (4) -- DaxServer (talk) 19:35, 14 July 2024 (UTC)
I added -titleregexnot:"C(hris )?Light" to the command so that categories like Category:Hazards_(LTNP) are skipped. -- DaxServer (talk) 15:22, 15 July 2024 (UTC)
Sounds good to me. I've been making corrections as I return to existing categories. If it can be automated, great. You've got my approval. Chris Light (talk) 21:51, 14 July 2024 (UTC)
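
For readers who want to see the per-page logic discussed above outside of pywikibot, here is a minimal sketch in plain Python. It is not the bot's actual code: the function name `update_category` is hypothetical, the title check follows the intent stated in the thread (leave alone categories such as Category:Hazards_(LTNP) whose titles do not mention Chris Light), and how that maps onto the pywikibot flags is not verified here.

```python
import re
from typing import Optional

OLD = "__HIDDENCAT__"
NEW = "{{User category|1=Chris Light}}"
SKIP_IF_PRESENT = "{{User category"            # same text as the -excepttext value
TITLE_PATTERN = re.compile(r"C(hris )?Light")  # regex quoted in the thread


def update_category(title: str, wikitext: str) -> Optional[str]:
    """Return the new wikitext, or None when the page should be left alone.

    Hypothetical helper for illustration only; it does not talk to the wiki.
    """
    if not TITLE_PATTERN.search(title):
        return None  # assumed intent: e.g. Category:Hazards_(LTNP) is skipped
    if SKIP_IF_PRESENT in wikitext:
        return None  # template already inserted
    if OLD not in wikitext:
        return None  # nothing to replace; left for the manual Petscan follow-up
    return wikitext.replace(OLD, NEW)
```

Pages for which the function returns None are the ones that would need the manual check against the Petscan query after the run.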

DGJ file descriptions from Flickr

Files like this have some duplicated text and content that is of interest mainly to Flickr users. In the past, I cleaned up some pages myself, but there are actually plenty of them, see Special:Search/"PLEASE, NO invitations or self promotions, THEY WILL BE DELETED." (about 10000). Also Category:Dennis G. Jarvis and Special:Search/"Dennis G. Jarvis" (27000).

Possibly Creator:Dennis G. Jarvis template could be added at the same time. Enhancing999 (talk) 07:38, 5 May 2024 (UTC)

@Enhancing999 Do you have example diffs of the cleanup? -- DaxServer (talk) 19:40, 14 July 2024 (UTC)

Hidden categories added as Category:Hidden categories

Hidden categories is a system category added by __HIDDENCAT__.

However, some files and even categories add it as a regular category: [[Category:Hidden categories]]

To find some: [1] (currently 468 in category namespace). Enhancing999 (talk) 13:22, 2 June 2024 (UTC)

I've reduced the numbers with Com:Cat-a-lot. The rest probably should be gone through manually. Jonteemil (talk) 23:04, 3 June 2024 (UTC)
Shouldn't they be replaced with __HIDDENCAT__? This finds those lacking that. Enhancing999 (talk) 23:18, 3 June 2024 (UTC)
I'm not sure all 128 categories really should be hidden. That's why I suggest they be gone through manually. Jonteemil (talk) 11:52, 4 June 2024 (UTC)
Currently 54 hits. Support fixing this. [[Category:Hidden categories]] should NOT appear. — Preceding unsigned comment added by Taylor 49 (talk • contribs) 14:12, 26 June 2024 (UTC)
I think this is done now. I've edited most of the remaining 43 categories using AWB. I was unsure about Category:Vector files with non-modifiable text, since there Category:Hidden categories is used as a piped link. Fl.schmitt (talk) 10:36, 14 July 2024 (UTC)
This section is resolved and can be archived. If you disagree, replace this template with your comment. Fl.schmitt (talk) 10:36, 14 July 2024 (UTC)
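
The replacement suggested in this thread (swap the explicit category for the magic word) could look roughly like the sketch below. It is an illustration in plain Python, not what was actually run with AWB; the function name `fix_hidden_category` is hypothetical, and whether a given category should be hidden at all remains the manual decision discussed above.

```python
import re

# Matches [[Category:Hidden categories]] with an optional sort key, but not
# the link form with a leading colon, [[:Category:Hidden categories|...]].
EXPLICIT_CAT = re.compile(
    r"\[\[\s*Category\s*:\s*Hidden[ _]categories\s*(\|[^\]]*)?\]\]\n?",
    re.IGNORECASE,
)
MAGIC_WORD = "__HIDDENCAT__"


def fix_hidden_category(wikitext: str) -> str:
    """Replace the explicit category with the magic word (hypothetical helper)."""
    if not EXPLICIT_CAT.search(wikitext):
        return wikitext  # nothing to fix
    cleaned = EXPLICIT_CAT.sub("", wikitext)
    if MAGIC_WORD not in cleaned:
        # Append the magic word once, on its own line at the end of the page.
        cleaned = cleaned.rstrip("\n") + "\n" + MAGIC_WORD + "\n"
    return cleaned
```

Pages that only link to the category with a leading colon are returned unchanged, so a case like the one mentioned for Category:Vector files with non-modifiable text would still need a human look.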

sorting files

Please help me sort files in the subcategories of Category:Photographs in the Golestan Palace Library by number. Sort keys should have three digits, as there might be more than a hundred files in each album. Hanooz 15:18, 7 June 2024 (UTC)

@Hanooz: this seems to be done, too - is this correct? If not, please comment. Thank you! Fl.schmitt (talk) 10:36, 14 July 2024 (UTC)
It's not, unfortunately. Hanooz 16:10, 14 July 2024 (UTC)
OK, I've removed the resolved template (sorry, I didn't understand at first that you want to sort the files inside the subcategories, not into the categories...). --Fl.schmitt (talk) 17:07, 14 July 2024 (UTC)
@Hanooz Is this the format - [2] [3] ? -- DaxServer (talk) 19:39, 14 July 2024 (UTC)
008.2 (or 008-2) for File:Golestan Palace Album No. 100-8.2.jpg and 008.1 (or 008-1) for File:Golestan Palace Album No. 100-8.1.jpg. What comes after the dot (1 or 2) is recto/verso. Hanooz 19:59, 14 July 2024 (UTC)
@Hanooz Here is what I gather: https://commons.wikimedia.org/w/index.php?title=User:DaxServer/sandbox&oldid=899125871 from Petscan https://petscan.wmcloud.org/?psid=28923652 I omitted the first few which do not have the pattern "Golestan_Palace_Album_No._" in the title. Please edit them manually, setting the desired sortkey. If the table looks good, I can file for the bot and do the edits. Let me know -- DaxServer (talk) 13:58, 15 July 2024 (UTC)
Looks great to me. Thanks. Hanooz 16:00, 15 July 2024 (UTC)
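
As an illustration of the sortkey format agreed above, the following sketch derives the three-digit key from a file title. It assumes titles follow the pattern "Golestan Palace Album No. <album>-<page>[.<side>].<extension>" as in the two examples given; the function name `sort_key` is hypothetical and this is not the bot's actual code.

```python
import re
from typing import Optional

# <album>-<page>, optionally followed by .<side> (recto/verso), then the extension.
TITLE = re.compile(
    r"Golestan[ _]Palace[ _]Album[ _]No\.[ _]\d+-(\d+)(?:\.(\d+))?\.\w+$"
)


def sort_key(title: str) -> Optional[str]:
    """Return a zero-padded sortkey such as '008.2', or None if the title does not fit."""
    match = TITLE.search(title)
    if match is None:
        return None  # titles without the pattern are left for manual editing
    page, side = match.groups()
    key = f"{int(page):03d}"  # three digits, since albums may exceed 100 files
    return f"{key}.{side}" if side else key
```

For example, `sort_key("File:Golestan Palace Album No. 100-8.2.jpg")` gives `"008.2"`, matching the format requested in the thread.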

Revert additions to Category:History by Mitte27

Thousands of uncategorized files were added to the already-bloated Category:History. All of the edits I find were on 31 May 2024. Could someone please automatically revert these edits? Thanks. Cryptic-waveform (talk) 20:55, 24 June 2024 (UTC)

I don't think it's a good idea to return it. My idea was to then move the files from "Category:History" to more specific categories. --Mitte27 (talk) 09:59, 25 June 2024 (UTC)
The current status is that thousands of files that were correctly marked as Uncategorized, and therefore easily visible to contributors doing a first round of categorization, are now erroneously categorized in a top-level category. Cryptic-waveform (talk) 13:04, 25 June 2024 (UTC)
@Mitte27: so when do you plan to move the images to more specific categories? This is clearly not an indefinite solution. —Matrix(!) {user - talk? - uselesscontributions} 18:55, 26 June 2024 (UTC)
I sorted out some photos related to the history of Russia/USSR, but I have little understanding of American history, and most of the photos in the category are related to it. In any case, this category is better than none. --Mitte27 (talk) 22:29, 26 June 2024 (UTC)
There is no reason to ever place files into extremely broad categories like Category:History. Please do not remove {{Uncategorized}} unless you are able to either accurately place a file in the most specific categories available or into a dedicated cleanup category. Pi.1415926535 (talk) 00:22, 27 June 2024 (UTC)
You could just use cat-a-lot. I don't think adding all LOC or NARA images to "History" by default is a good idea. Enhancing999 (talk) 11:06, 30 June 2024 (UTC)

Convert Category:Photographs by Carol M. Highsmith to JPEG

Category:Photographs by Carol M. Highsmith is an excellent Library of Congress collection of very good images. Unfortunately, all those images are in TIFF format, which means that the average file size is 100-300 MB, which is incredibly large. It causes long loading times for even the preview image (let alone the actual file), and the TIFF file format is not supported by most browsers or general applications. Wikipedia discourages using TIFF files for those reasons, and this reduces the likelihood of those excellent images being used.

Therefore, some bot should convert those TIFFs to JPEGs, copy the descriptions/categories and make sure the files reference each other. Further, the categories from the TIFF files should be replaced with Category:LC TIF images with categorized JPGs. TheImaCow (talk) 21:59, 30 June 2024 (UTC)
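
The category handling in this request can be sketched as follows. This is only an illustration of the page-text side (the image conversion itself and the mutual references between the two file pages are not covered), and the function name `split_page_texts` is hypothetical rather than part of any existing bot.

```python
import re

CATEGORY = re.compile(r"\[\[\s*Category\s*:[^\]]+\]\]\n?", re.IGNORECASE)
TIF_HOLDING_CATEGORY = "[[Category:LC TIF images with categorized JPGs]]"


def split_page_texts(tiff_wikitext: str):
    """Return (jpeg_wikitext, new_tiff_wikitext) for one TIFF file page.

    The JPEG page keeps the description and the original categories; the
    TIFF page keeps the description and gets only the holding category.
    """
    categories = [m.group(0).strip() for m in CATEGORY.finditer(tiff_wikitext)]
    description = CATEGORY.sub("", tiff_wikitext).rstrip("\n")
    jpeg_text = "\n".join([description, *categories]) + "\n"
    tiff_text = description + "\n" + TIF_HOLDING_CATEGORY + "\n"
    return jpeg_text, tiff_text
```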

@TheImaCow Thanks for finding this. I've filed for a bot Commons:Bots/Requests/ImageConverterBot -- DaxServer (talk) 15:13, 1 July 2024 (UTC)
I didn't expect someone to reply to this so quickly, thank you!
I came across this series via Category:Aerial photographs of the United States and subcats, which contains many poorly categorized images from this collection. TheImaCow (talk) 16:40, 1 July 2024 (UTC)
LCCN2013631230.tif shows a jpg and several jpg-sizes are offered. Is this really needed? Enhancing999 (talk) 18:44, 1 July 2024 (UTC)
Hmm, I didn't notice that. It seems it is not necessary after all -- DaxServer (talk) 20:56, 1 July 2024 (UTC)
Maybe I'm blind, but where are those files offered? It's not the "Download/Use this file/Email a link" bar; all resolutions there only download the same low-quality preview generated by the MediaWiki software (which is shown on the file description page). TheImaCow (talk) 21:19, 1 July 2024 (UTC)
Below the image, there is a line:
"Size of this JPG preview of this TIF file: 800 × 533 pixels. Other resolutions: 320 × 213 pixels | 640 × 427 pixels | 1,024 × 683 pixels | 1,280 × 853 pixels | 2,560 × 1,707 pixels | 6,144 × 4,096 pixels."
The last one matches the tiff. Enhancing999 (talk) 21:52, 1 July 2024 (UTC)
Oh thanks, I see. However, this is very obscure, and when embedding the file anywhere, it will always refer to the TIF version - so a separate JPG should probably still be uploaded, like the 220,000 other TIF files in Category:LC TIF images with categorized JPGs (or the 58,000 in Category:NARA TIF images with categorized JPGs).
But I don't have strong opinions on this. TheImaCow (talk) 22:11, 1 July 2024 (UTC)
Loading a file to test this -- DaxServer (talk) 05:51, 2 July 2024 (UTC)
Possibly support for tiffs was less developed when they were uploaded. I wonder how all those thousands of duplicates are curated and how much volunteer time is lost by handling two instead of just one copy of every image. WMF recently expressed their view on hosting files on Commons that aren't used on WMF sites [4]. Enhancing999 (talk) 09:22, 2 July 2024 (UTC)
Well, in theory, those TIF duplicates shouldn't need any curation, as they are supposed to be dumped into the massive categories mentioned above, and only linked from the description of the maintained JPG version.
The use of TIF is something I think is generally not needed for 99.9% of files; modern compression is more than good enough.
(I don't oppose eventually getting rid of the TIF duplicates, but there is not even consensus to delete de-facto duplicates where one version is rotated differently by single degrees, or random low quality TIF scans of generic text documents, where the same scans are also uploaded as JPG, so forget it) TheImaCow (talk) 13:28, 2 July 2024 (UTC)
Oddly, I can't figure out which one of the two maps is correct ;) Did you nominate the wrong one? For the text ones, I'd have nominated the jpg ones. The assumption that deletion doesn't save anything is incorrect: deletion reduces curation (even if theoretically none is needed, it still happens and wastes volunteer time), limits spamming of Special:search, and can even save storage space, as files can be purged (from non-public view) or won't be exported twice when requested.
As technology changes, I think views on this evolve. NARA's approach might have been the ideal 15 years ago, but other GLAMS that started only more recently use different approaches. Enhancing999 (talk) 12:13, 3 July 2024 (UTC)
Not sure what you mean. Both maps are exactly the same. JPG ones nominated instead? Ideally someone uploads a PDF and 307 files are replaced with one in the correct format for documents. I never said I oppose deletions, I said the exact opposite.
The NARA approach has actually changed - there have been at least two bulk uploads, one in 2011 and the other in 2019.
The 2011 one uploaded nearly every image twice - one TIF + one JPG. The 2019 one uploaded only JPGs.
Looking at the NARA catalogue, files uploaded earlier often have TIF, JPG and sometimes GIF versions for download. Images uploaded in 2019, presumably digitized later, have only high-resolution JPGs for download. TheImaCow (talk) 18:33, 3 July 2024 (UTC)
It's better to have the lossless files than a JPEG, as you can always make a JPEG from a lossless file, but you can't make a lossless from a JPEG. Still, while we shouldn't delete the TIFFs, we should make JPEG options. Adam Cuerden (talk) 08:52, 4 July 2024 (UTC)
If we want to offer lossless files in reasonable sizes (2MB vs 200MB), we might want to consider offering PNGs instead of JPEGs -- DaxServer (talk) 08:57, 4 July 2024 (UTC)
@DaxServer: Please don't, PNG images look fuzzy when scaled down (due to design decisions discussed in phab:T192744) on WMF projects. — 🇺🇦Jeff G. please ping or talk to me🇺🇦 12:59, 13 July 2024 (UTC)

Template:Unknown

The IP range 64.189.18.0/24 has been blocked for removing the template:Unknown many times (example). I was reverting tens of these edits, but there appear to be hundreds. Could a bot operator revert these edits? Wikiwerner (talk) 17:18, 15 July 2024 (UTC)
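
A bot operator picking this up would need some way to select the affected edits. The sketch below shows one possible filter using Python's standard `ipaddress` module; the function name `should_revert` and the idea of comparing the text before and after each edit are assumptions for illustration, not an existing tool.

```python
import ipaddress
import re

BLOCKED_RANGE = ipaddress.ip_network("64.189.18.0/24")
# Matches the start of a transclusion such as {{Unknown|author}} (case-insensitive).
TEMPLATE = re.compile(r"\{\{\s*unknown\b", re.IGNORECASE)


def should_revert(editor: str, old_text: str, new_text: str) -> bool:
    """True if an edit from the blocked range removed the Unknown template."""
    try:
        address = ipaddress.ip_address(editor)
    except ValueError:
        return False  # editor is a registered account name, not an IP
    if address not in BLOCKED_RANGE:
        return False
    return bool(TEMPLATE.search(old_text)) and not TEMPLATE.search(new_text)
```

Edits that pass this check would still be worth spot-checking before a mass revert, since later edits to the same page may have changed the text again.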