Commons:Batch uploading
COM:BATCH
Commons Batch Uploading is a project to centralize the uploading of collections of files whose creators have released their work into the public domain (PD) or under any Commons-compatible license. Each request is assigned to a bot operator, who works out how it will be fulfilled. (To upload batches from Flickr, please make requests at Commons:Flickr batch uploading.)
Before requesting a batch upload here, please read the guide to batch uploading.
See w:Wikipedia:Public domain image resources for potential future batch uploads.
Create your Upload request:
Add your Upload request under one of the following sections:
Contents
- 1 Scripters
- 2 Tools
- 3 Requests
- 4 Batch uploads in progress
- 5 Batch uploads on hold
- 6 Past batch uploads
- 7 Failed
Scripters
- Multichill (talk · contribs)
- Aude (talk · contribs) - including batch audio & video uploads
- Jarekt (talk · contribs)
- Slick (talk · contribs) - no audio/video
- Fæ (talk · contribs) - see project list
- Husky (talk · contribs)
- Basvb (talk · contribs)
Currently inactive
Tools
- See Commons:Upload tools. The Python Wikipedia Bot framework supports image uploads and is particularly versatile.
- Commonist - free Java program to upload large numbers of files to Commons
- d:Help:QuickStatements - tool for batch upload of metadata to Wikidata, which can then be accessed by {{Artwork}} and other templates.
- Upload Script by Erik Möller
- Flickrripper allows batch uploading from a set, group or a user id on flickr.
- The GLAMwiki toolset was built by Europeana to quickly get whole collections into Wikimedia.
- We need tools to facilitate rapid, accurate categorization of many images at once.
Scripts, Examples and Information
- The scripts I am using on jobs here and here
- A bash script to extract the VRINs from (U.S. military) pictures on Commons; can be very useful for finding duplicates before upload
- Details about 'Zoomify' images and how to get them (in German)
- How to import images from news.kremlin.ru: import news.kremlin.ru news gallery.sh & import news.kremlin.ru photo gallery.sh
Requests
VOA News files
- Source to upload from: https://web.archive.org/web/*/https://www.voanews.com/mp3/voa/english/nnow/NNOW_HEADLINES.mp3
- Do the media URLs follow a pattern? They all have the same name. The date when archived is given in 14 digits, with the first eight digits being the year, month, and day respectively.
- Does the site have an API? Don't know.
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?) Don't know.
- Did you contact the site owner? No need to, since these are U.S. government works and therefore in the public domain.
- Describe the works to be uploaded in detail (audio files, images by …): VOA news headline audio files for (almost) every day spanning from 5 May 2009 to 7 September 2018.
- Which license tag(s) should be applied? Template:PD-USGov-VOA
- Is there a template that could be used on the file description pages? Do you think a special template should be created? Just use the standard one. Upload as "VOA News Headlines (MONTH DAY, YEAR)". If possible, upload them in FLAC, WAV, and OGG.
– Illegitimate Barrister (talk • contribs), 08:47, 26 July 2018 (UTC)
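The requested file names can be derived from the 14-digit capture timestamp described above. The following is a minimal sketch, assuming capture URLs have the usual Wayback Machine shape `https://web.archive.org/web/<timestamp>/<original URL>`; the function name `commons_title` and the default `ogg` extension are illustrative choices, not part of the request. Note that the timestamp is the date of archiving, which may not always match the broadcast date.

```python
import re
from datetime import datetime

# Assumed capture URL shape: .../web/<YYYYMMDDhhmmss>[modifier]/<original URL>
CAPTURE_RE = re.compile(r"/web/(\d{14})(?:[a-z]{2}_)?/")

# Spelled out explicitly so the result does not depend on the system locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def commons_title(capture_url: str, ext: str = "ogg") -> str:
    """Build 'VOA News Headlines (MONTH DAY, YEAR).ext' from a capture URL."""
    match = CAPTURE_RE.search(capture_url)
    if match is None:
        raise ValueError(f"no 14-digit timestamp found in {capture_url!r}")
    stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    month = MONTHS[stamp.month - 1]
    return f"VOA News Headlines ({month} {stamp.day}, {stamp.year}).{ext}"
```

If several captures exist for the same day, the titles will collide, so a real script would need to pick one capture per day or add a disambiguator.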
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
HiRISE
- Source to upload from: https://www.uahirise.org/catalog/
- Do the media URLs follow a pattern? Yes! (Based on the catalog ID)
- Does the site have an API? No!
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?)
- Full index tree of the images on the site is accessible via:
- With extra files in:
- Each image file (*.JP2; sample [big file]) is accompanied by a separate label file (*.LBL) in PDS format (sample) that holds the additional information.
- Did you contact the site owner? Nope!
- Describe the works to be uploaded in detail (audio files, images by …):
- Images by HiRISE (High Resolution Imaging Science Experiment)
- Which license tag(s) should be applied?
- As explained in each image's description page for example: "All of the images produced by HiRISE and accessible on this site are within the public domain: there are no restrictions on their usage by anyone in the public, including news or science organizations. We do ask for a credit line where possible: NASA/JPL/University of Arizona"
- PD-USGov-NASA or a variation of it to include JPL and University of Arizona must be used.
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
- There is no template yet. One must be created to include all the relevant data from the label files, e.g. acquisition date, latitude, longitude, etc.
- Note: Because JPEG 2000 is not currently supported on Wikimedia Commons, a conversion to PNG is also needed. File sizes may be large!
Meisam (talk) 21:58, 20 June 2018 (UTC)
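To fill a future template from the label files, the `KEYWORD = VALUE` lines of a PDS label would have to be read. The sketch below is a deliberately simplified reader, assuming a plain-text label with one `KEYWORD = VALUE` pair per line and a terminating `END` line; it does not handle values that span several lines, and the function name `parse_label` is hypothetical. Which keywords actually carry the acquisition date or coordinates has to be checked against a real HiRISE label.

```python
def parse_label(text: str) -> dict[str, str]:
    """Collect flat KEYWORD = VALUE pairs from a PDS-style label (simplified)."""
    fields: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "END":
            # Everything after the END statement is not label text.
            break
        if "=" not in line or line.startswith("/*"):
            # Skip blank lines, comments and continuation lines.
            continue
        key, _, value = line.partition("=")
        # Keep the first occurrence if a keyword repeats in nested objects.
        fields.setdefault(key.strip(), value.strip().strip('"'))
    return fields
```

The resulting dictionary could then be mapped onto template parameters once the template fields are agreed on.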
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
freepd.com
Site contains production music tracks in various genres, in MP3 format.
- Source to upload from:
- Do the media URLs follow a pattern?
None found. Tracks seem to be in subdirectories related to nominal genre; MP3 files are apparently named after the track title.
- Does the site have an API?
Unknown.
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?)
Unknown.
- Did you contact the site owner?
Site owner not contacted.
- Describe the works to be uploaded in detail (audio files, images by …):
"Production music" in various genres, in MP3 format.
- Which license tag(s) should be applied?
The site claims the tracks are in the public domain: http://freepd.com/faq.html. However, some of these tracks were previously under CC-BY on the site owner's other site, incompetech.
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
{{information}} with additional field as was done on the previous batch upload for incompetech.
ShakespeareFan00 (talk) 10:20, 18 December 2017 (UTC)
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
Commons:Batch uploading/timbeek.com/
- Source to upload from:
http://timbeek.com/ in particular music tracks listed in http://timbeek.com/royalty-free-music/isrc/
- Do the media URLs follow a pattern?
No general pattern, but there's a master list of track pages (not sure if it's complete) here: http://timbeek.com/royalty-free-music/isrc/. Download links in the UI seem to point to numbered subdirectories, but no general pattern is obvious.
- Does the site have an API?
Unknown.
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?)
Unknown
- Did you contact the site owner?
Site owner not contacted.
- Describe the works to be uploaded in detail (audio files, images by …):
A small set of 'production music' tracks in assorted genres.
- Which license tag(s) should be applied?
See http://timbeek.com/royalty-free-music/license/. Assuming the attribution requirements are met, the music appears to be under CC-BY 4.0 (see also http://timbeek.com/royalty-free-music/faq/ and http://timbeek.com/royalty-free-music/copyright/).
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
{{information}} with additional fields, as was previously implemented for the incompetech.com batch upload (this site seems to use a similar approach).
ShakespeareFan00 (talk) 19:05, 15 December 2017 (UTC)
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
Images of listed buildings by Stephen Richards on Geograph.org.uk
- Source to upload from: http://www.geograph.org.uk
- Do the media URLs follow a pattern? Yes: http://www.geograph.org.uk/photo/[ID]
- Does the site have an API? Yes: http://www.geograph.org.uk/help/api
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?) Don't know
- Did you contact the site owner? No need
- Describe the works to be uploaded in detail (audio files, images by …):
- All photographs of listed buildings by this user are of high quality and are tagged [listed building]. They would be very useful to have on Commons as every listed building has an item on Wikidata. I'd like them to be uploaded en masse and given the categories Category:Listed buildings in [county or London borough] and Category:Images by Stephen Richards. I could then further refine the listed building categories manually. However, the terms "Grade I", "Grade II*" and "Grade II" (the three listing grades for buildings in England and Wales) appear in the image descriptions, so is there a way that these could be picked out and used to categorise the images on Commons?
- Which license tag(s) should be applied?
- {{Geograph}}
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
- {{Geograph}}
Ham II (talk) 19:50, 16 November 2017 (UTC)
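On the question of picking the listing grade out of the image descriptions: a regular expression can do this, as long as "Grade II*" is tried before "Grade II" and "Grade I". The following is a minimal sketch; the function name `listing_grade` is hypothetical, and it assumes the descriptions spell the grade exactly as "Grade I", "Grade II*" or "Grade II".

```python
import re
from typing import Optional

# Longest alternative first, so "II*" is not mistaken for "II" or "I".
# The lookahead rejects strings such as "Grade III" or "Grade IV".
GRADE_RE = re.compile(r"\bGrade\s+(II\*|II|I)(?![A-Za-z*])")


def listing_grade(description: str) -> Optional[str]:
    """Return 'I', 'II*' or 'II' if the description names a grade, else None."""
    match = GRADE_RE.search(description)
    return match.group(1) if match else None
```

A bot could use the returned value to choose an additional grade-specific category, and leave files where the function returns `None` in the general county category for manual sorting.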
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
USDA NRCS Plants Database
- Source to upload from: http://plants.usda.gov/
- Do the media URLs follow a pattern? Yes.
- Does the site have an API? No.
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?) valid XHTML
- Did you contact the site owner? No.
- Describe the works to be uploaded in detail (audio files, images by …): Public domain: 10771 photos and 7064 line drawings, with species information for categorization. There are other copyrighted images as well, some of which may be freely licensed.
- Which license tag(s) should be applied?
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
Opinions
@Guanaco: There is a lot of copyrighted material within these images, e.g. [1] [2]. (Just because this is a U.S. government web site this does not mean all the material is U.S. government material and by this means freely usable!) Actually I have not found too many images that really can be used (e.g. [3]). You should at least provide a procedure how to distinguish between copyrighted and free material. --Reinhard Kraasch (talk) 11:02, 9 July 2017 (UTC)
- @Reinhard Kraasch: The gallery search function [4] has a filter by copyright status. [5]
- I've found that the URLs linked by the thumbnails provide species information within <title>: https://plants.usda.gov/core/profile?symbol=HACA2&photoID=haca2_003_ahp.jpg#
- and correspond to the URLs of the actual files: https://plants.usda.gov/gallery/pubs/haca2_003_php.jpg
- as well as the URL with copyright status and recommended attribution info: https://plants.usda.gov/java/usageGuidelines?imageID=haca2_003_ahp.jpg
- The search is navigable with &page=2, 3, 4, etc.
- I'm actually interested in scripting this myself now, though it would be my first batch upload task. Guanaco (talk) 14:23, 9 July 2017 (UTC)
- @Guanaco: Well, just go on... On the other hand it always is a good idea to have a second opinion with such a batch upload - especially for the non-technical aspects. --Reinhard Kraasch (talk) 20:52, 10 July 2017 (UTC)
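The URL relationships Guanaco describes can be sketched as a small helper that builds the profile page URL and the usage-guidelines URL from an image ID such as `haca2_003_ahp.jpg`. This is only an illustration based on the single example above: it assumes the plant symbol is the part of the image ID before the first underscore, written in upper case, and the function name `image_urls` is hypothetical. The direct file URL is left out because the one example given uses a slightly different file name.

```python
from urllib.parse import urlencode

BASE = "https://plants.usda.gov"


def image_urls(image_id: str) -> dict[str, str]:
    """Build the profile and usage-guidelines URLs for one gallery image."""
    # Assumption drawn from one example: 'haca2_003_ahp.jpg' -> symbol 'HACA2'.
    symbol = image_id.split("_", 1)[0].upper()
    profile_query = urlencode({"symbol": symbol, "photoID": image_id})
    usage_query = urlencode({"imageID": image_id})
    return {
        "profile": f"{BASE}/core/profile?{profile_query}",
        "usage": f"{BASE}/java/usageGuidelines?{usage_query}",
    }
```

The usage-guidelines page is the one to check before upload, since it carries the copyright status that separates free from copyrighted images.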
Assigned to | Progress | Bot name | Category |
---|---|---|---|
US National Archives
I am hoping to begin a bulk upload of media from the US National Archives in the next few weeks. This will be a very different approach from the first upload, which was based on uploading files from an offline drive and scraping HTML for the metadata. This time around, NARA has an API for our online catalog, and so I am building a bot, using mwclient, to upload using the live metadata and files from the API. Some details:
- Dataset
The dataset includes all PD materials at https://catalog.archives.gov (API: https://catalog.archives.gov/api/v1). I plan to begin with a series of ~100,000 WWI-era photos. Technically, there are over 15 million files (and counting) in this dataset.
- File names
The script is currently configured to name files with the following formula.
For single-page items:
- "File:[TITLE] - NARA - [NAID].ext"
- Where "[TITLE]" is the catalog record's title field, and "[NAID]" is the National Archives Identifier. If this is over the character limit, "[TITLE]" is automatically truncated, with "(...)" appended.
For multi-page items (since the above formula would give all files belonging to one catalog record the same title):
- "File:[TITLE] - NARA - [NAID] (page X).ext"
- Metadata
We are developing a custom metadata mapping, since NARA does not adhere to a metadata standard. You can see the metadata template we use here: {{NARA-image-full}}. Some notes:
While all the records in this catalog come from NARA or partner institutions, there are many different facility locations, and some NARA facilities already have their own institution templates (e.g. US presidential libraries). Therefore, I am creating institution templates for all NARA locations, and the script will insert the correct institution template based on a mapping.
NARA's authority file is not yet mapped to Wikidata, however that is definitely something that would be useful in the future. For now, we will upload files with NARA's creator and author names and their NAIDs and links back to the catalog authority record. However, including the NAIDs in a Commons template field means that in the future, Wikidata could be used to make creator templates appear instead. Any help with this would be appreciated.
- Licenses
Because NARA records are nearly all (>99%) derived from the records of US federal agencies, these uploads will use {{PD-USGov}} or its subtemplates. Most NARA records are in one of about 600 record groups based on their creating agency, so I am using a mapping of NARA record groups to Commons PD-USGov templates so that the bot can apply the more specific agency templates in most cases. Help filling out this mapping would be appreciated.
Nearly all holdings of the US National Archives are in the public domain as a work of the federal government (or, otherwise, due to age). This is marked in the "use restriction" field in the catalog, with a value of "Unrestricted" indicating a public domain determination by the archivists. Therefore, the script will be configured to skip any record whose use restriction is anything other than "Unrestricted" (even "possibly" ones, which could ultimately be PD, but need a human determination).
- Categories
All uploads will be automatically categorized by the metadata template into Category:Media contributed by the National Archives and Records Administration and a category for the series they belong to (such as Category:US National Archives series: DOCUMERICA: The Environmental Protection Agency's Program to Photographically Document Subjects of Environmental Concern, compiled 1972 - 1977). Eventually, the script will be designed to create the series category if a file is uploaded for a series which does not yet have one.
When it comes to topical categories, past NARA uploads utilized the {{uncategorized}} tag to encourage the community to add topical tags. However, since this creates work for the community, I am planning this time around to run uploads a small batch (hundreds to a few thousand) at a time, so I can upload them with one or more topical categories that apply to all records in the batch, rather than uncategorized.
- Code
You can find the upload bot's code at https://github.com/usnationalarchives/wikimedia-upload. This project is being developed in public on NARA's official GitHub account. I would welcome collaboration (pull requests or otherwise) there. In addition, the Commons community is welcome to file issue reports on that repo.
- Examples
The most recent test uploads can be viewed in Category:US National Archives series: American Unofficial Collection of World War I Photographs. I am still polishing the upload script, but these examples essentially represent what should be expected from the bot once it gets started.
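For readers who want to see the naming formula and the use-restriction rule in concrete form, here is a minimal sketch. It is an illustration only, not the bot's actual code (which is in the repository linked above). The 240-byte limit, the `use_restriction` dictionary key, and the function names `build_filename` and `should_upload` are all assumptions made for this example.

```python
from typing import Optional

TRUNCATION_MARKER = "(...)"


def build_filename(title: str, naid: int, ext: str,
                   page: Optional[int] = None, limit: int = 240) -> str:
    """Return 'File:[TITLE] - NARA - [NAID].ext', truncating TITLE if needed.

    `limit` is a hypothetical byte budget for everything after 'File:'.
    """
    suffix = f" - NARA - {naid}"
    if page is not None:
        # Multi-page items get a page number so names stay unique.
        suffix += f" (page {page})"
    suffix += f".{ext}"
    budget = limit - len(suffix.encode("utf-8"))
    if len(title.encode("utf-8")) > budget:
        keep = budget - len(TRUNCATION_MARKER)
        # Cut on bytes, then drop any half-cut multi-byte character.
        trimmed = title.encode("utf-8")[:keep].decode("utf-8", errors="ignore")
        title = trimmed.rstrip() + TRUNCATION_MARKER
    return f"File:{title}{suffix}"


def should_upload(record: dict) -> bool:
    """Upload only records whose use restriction is 'Unrestricted'."""
    value = str(record.get("use_restriction", ""))
    return value.strip().lower() == "unrestricted"
```

For example, `build_filename("Short title", 12345, "jpg", page=2)` gives `File:Short title - NARA - 12345 (page 2).jpg`, and a record marked "Possibly restricted" is skipped by `should_upload`.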
Opinions
The bot account is technically already flagged from the last bulk upload a couple of years ago, however I would like to submit the current plan to community review before restarting uploads. If there are any opinions on the bot's design or the format of uploads or other issues, I am happy to hear them. We'd also like to know whether to limit what is uploaded in any way—as in, would Commons actually be interested in 15 million files, or might some of these, like the millions of census cards, not be of interest. Also, if anyone is interested in helping out with the coding or other tasks, please feel free to let me know. This is a big undertaking. Thanks! Dominic (talk) 17:25, 31 May 2017 (UTC)
Assigned to | Progress | Bot name | Category |
---|---|---|---|
User:Dominic | Coding | User:US National Archives bot | Category:Media contributed by the National Archives and Records Administration |
ESA-Rosetta-NAVCAM
- Source to upload from: http://imagearchives.esac.esa.int/index.php?/recent_pics
- Did you observe a URL pattern? See http://imagearchives.esac.esa.int/index.php?/page/rosetta_navcam
- Do you know whether the site has an API?
- What else can ease uploading (is the site valid XHTML, do they use a WCM…)?
- Did you contact the site owner? No.
- Describe the works to be uploaded in detail (audio files, images by …):
- Images of the comet 67P/Churyumov-Gerasimenko taken by the NAVCAM on the Rosetta spacecraft.
- Which license tag(s) should be applied? ESA/Rosetta/NAVCAM – CC BY-SA IGO 3.0 (see {{ESA-ROSETTA-NAVCAM}} for the specific license template.)
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
Yann (talk) 14:32, 6 June 2015 (UTC)
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
Bembo Visconti Tarot
- Source to upload from: http://quatramaran.salle-s.org/~madore/visconti-tarots/large/
- Do the media URLs follow a pattern? No. All 74 images should be linked from the index page above. Use the links ending in ".jpg", not the low resolution previews. Skip the links ending in ".RES.jpg". Please verify that there are exactly 74 such images, I haven't done that.
- Does the site have an API? It's just plain HTTP.
- What else could ease uploading? (is the site valid XHTML, do they use a WCM…?)
- Did you contact the site owner? No.
- Describe the works to be uploaded in detail (audio files, images by …):
Cards from the Visconti-Sforza tarot deck, beautifully drawn in the mid-15th century by Italian artist Bonifacio Bembo for the Visconti and Sforza dukes of Milan. Four of the seventy-eight cards are lost from the deck (the fifteenth and sixteenth major arcana, respectively the Devil and the Tower; the Knight of Coins; and the Three of Swords), so seventy-four cards remain.
The card images were scanned by David Madore and published on the internet in 2003-09. They are scans of a facsimile edition printed by AGMüller in Switzerland (US Games Systems also seems to have been involved in the editing process somehow), which David bought in 2003-08 from an online store (broken link).
Category: Category:Pierpont Morgan-Bergamo Visconti-Sforza Tarot
- Which license tag(s) should be applied?
{{PD-old-auto-1923 |deathyear=1480 }}
- Is there a template that could be used on the file description pages? Do you think a special template should be created?
Not necessary.
– b_jonas 13:44, 5 April 2018 (UTC)
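The link selection rule above (take links ending in ".jpg", skip those ending in ".RES.jpg", then check the count) can be sketched with the Python standard library alone. This is a minimal sketch that works on the already-downloaded HTML of the index page; the names `LinkCollector` and `full_size_images` are hypothetical, and the structure of the real index page has not been checked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def full_size_images(index_html: str, base_url: str) -> list[str]:
    """Return absolute URLs of full-size scans, skipping '.RES.jpg' previews."""
    parser = LinkCollector()
    parser.feed(index_html)
    wanted = {
        urljoin(base_url, href)
        for href in parser.links
        if href.endswith(".jpg") and not href.endswith(".RES.jpg")
    }
    return sorted(wanted)
```

Before uploading, the operator would compare `len(full_size_images(...))` against the expected 74 images, as asked in the request.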
Opinions
Assigned to | Progress | Bot name | Category |
---|---|---|---|
Old requests (over two years)
- Commons:Batch uploading/NASA EOL
- Commons:Batch uploading/IVIE
- Commons:Batch uploading/ETH Zürich Bildarchiv
- Commons:Batch uploading/Bukowskis
- Commons:Batch uploading/Peter Parker's Lam Qua Paintings Collection
- Commons:Batch_uploading/State_Library_of_North_Carolina
- Commons:Batch uploading/National Museum of Korea
- Commons:Batch uploading/The Digital South Asia Library
- Commons:Batch uploading/Musée des Augustins
- Commons:Batch uploading/Rubin Kazan - Llevant UD
- Commons:Batch uploading/Openclipart
- Commons:Batch uploading/Digitaler Portraitindex
- Commons:Batch uploading/Geograph_Deutschland
- Commons:Batch_uploading/Glitch_Artwork
- Commons:Batch uploading/veikkos-archiv
- Commons:Batch uploading/Kurt Rasmussen
- Commons:Batch uploading/Bildarchiv Austria
- Commons:Batch uploading/Champlitte
- Commons:Batch uploading/Fonds Eugène Trutat bis
- Commons:Batch uploading/Spread the sign
- Commons:Batch uploading/DPLA
- Commons:Batch uploading/Museum of History of Photography
- Commons:Batch uploading/Rijksmuseum
- Commons:Batch uploading/Fonds Ancely
- Commons:Batch uploading/South African churches
- Commons:Batch uploading/National Gallery of Art
- Commons:Batch uploading/11k of Areal Photos
- Commons:Batch uploading/Garden of the Victory in Chelyabinsk
- Commons:Batch uploading/Gerald R. Ford Presidential Library and Museum
- Commons:Batch uploading/Rudolf Steiner Gesamtausgabe
- Commons:Batch uploading/Detroit Publishing Company at LoC
- Commons:Batch uploading/Cesare Brizio
- Commons:Batch uploading/Works of Maurice Ravel
- Commons:Batch uploading/Maritime photo collection
- Commons:Batch uploading/Images from Caelum Observatory & The Mount Lemmon SkyCenter
- Commons:Batch uploading/Dokpro
- Commons:Batch uploading/UMich
- Commons:Batch uploading/ECGPedia
- Commons:Batch uploading/ian.umces.edu
- Commons:Batch uploading/Yale
- Commons:Batch uploading/Geheugen van Nederland
- Commons:Batch uploading/Africa Centre
- Commons:Batch uploading/Canada Line
- Commons:Batch uploading/Codex Gigas
- Commons:Batch uploading/Right Livelihood Award
- Commons:Batch uploading/Old city maps
- Commons:Batch uploading/IUCN red list
- Commons:Batch uploading/VOA pronunciation sound files
- Commons:Batch uploading/Population distributions of Japan
- Commons:Batch uploading/The Tansey Collection of Miniatures
- Commons:Batch uploading/Piqs.de
- Commons:Batch uploading/Pearson Scott Foresman SVG files
- Commons:Batch uploading/Old Book Art
- Commons:Batch uploading/University of Washington Digital Collections
- Commons:Batch uploading/NOAA Photo Library
- Commons:Batch uploading/Beinecke's collections
- Commons:Batch uploading/Ryhiner Collection
- Commons:Batch uploading/Zorger
- Commons:Batch uploading/Mollusca by Jan Delsing
- Commons:Batch uploading/Virtual Manuscript Library of Switzerland
- Commons:Batch uploading/Mineral pictures of Leon Hupperichs on mineralienatlas
- Commons:Batch uploading/West Bengal Public Library Network
Batch uploads in progress
- Commons:Batch uploading/Archives of American Art - Federal Art Project
- Commons:Batch uploading/US National Archives
- Commons:Batch uploading/United States Fish and Wildlife Service
- Commons:Batch uploading/Geographicus
- Commons:Batch uploading/Ordnance Survey
- Commons:Batch uploading/Tropenmuseum
- Commons:Batch uploading/Festivals
- Commons:Batch uploading/US Air Force
- Commons:Batch uploading/US Army
- Commons:Batch uploading/Navy News Service
- Commons:Batch uploading/Metropolitan Museum of Art
- Commons:Batch uploading/WLANL
- Commons:Batch uploading/NYPL Digital Gallery
- Commons:Batch uploading/Minerals from various sources on mindat
- Commons:Batch uploading/Anefo
- Commons:Batch uploading/Airliners
- Commons:Batch uploading/Art of Japan in the Rijksmuseum
- Commons:Batch uploading/Library of Congress
- Commons:Batch uploading/Internet Archive and BEIC
- Commons:Batch uploading/TheNounProject
Batch uploads on hold
Past batch uploads
2005 - 2009
Date | Name (Subpage) | Description | Images | Scripter | Uploader | Script | Category | File naming |
---|---|---|---|---|---|---|---|---|
10,000 paintings from Directmedia | 10,000 public domain images digitized by the Yorck project and contributed to commons | 10,000 | Eloquence | File Upload Bot (Eloquence) | PD-Art (Yorck Project) | |||
Picswiss project | Roland Zumbühl agreed to release his images, which depict various areas and subjects in Switzerland, under the GFDL. | 5,000 of 13,000 | Dake | Dake | Images from Picswiss | |||
Bundesarchiv | From the German Federal Archive, the images depict Germany between the 19th and 20th century including valuable photographs of the Nazi era and World War II. | 100,000 | Duesentrieb | BArchBot | Information fetch | Images from the German Federal Archive | Bundesarchiv <id>, <desc> | |
Starr images | Images of plants of Hawaii | 60,000 | Multichill | Multichill | Images from Forest & Kim Starr | Starr <date>-number <taxon/desc> | ||
Wenceslas Hollar Digital Collection | A collection of 2700 high resolution images of engravings of Wenceslas Hollar, about 90% of his life works | 2,700 | Dcoetzee | Dcoetzee | University of Toronto Wenceslas Hollar Digital Collection | |||
National Portrait Gallery | Various portraits of famous people between the 16th and 19th century. | 3,000 | Dcoetzee | Dcoetzee | National Portrait Gallery, London | |||
Deutsche Fotothek | Images from Deutsche Fotothek mainly about east Germany between the 19th and 20th century including the Bombardment of Dresden and other events. Only 25% of the images have been uploaded till now. | 62,128 of 250,000 | Multichill | FotothekBot | Tools used | Images from the Deutsche Fotothek | Fotothek <id> <desc> | |
Berger Collection | A collection of high resolution images of paintings and other works from the Berger Collection, depicting British art, culture and people. | 140 | Dcoetzee | Dcoetzee | Berger Collection | |||
Great Images in NASA | Images from Great Images in NASA | 1,400 | TheDJ | Multichill | Great Images in NASA | |||
Alaska-Yukon-Pacific Exposition of 1909 | High-resolution scans of documents from the Alaska-Yukon-Pacific Exposition found here. | 700 | Dcoetzee | Dcoetzee | Alaska-Yukon-Pacific Exposition | |||
Commanster | Pictures of plants, animals, birds and insects of Commanster, Belgium by James Lindsey | 6,000 | Sarefo | Sarefo | Pictures by James Lindsey | |||
WLANL | Images from Wiki Loves art Netherland imported from the flickr group pool, depicting Netherland and its different museums. | 4,000 | Multichill | BotMultichillT | Images from Wiki Loves Art Netherlands | WLANL - <team> - <desc> | ||
FEMA site | All the images found on the US Federal Emergency Management Agency Disaster Photo Library were copied to Commons, depicting US environmental disasters and emergency actions. | 20,000 | Multichill | BotMultichillT | script | PD US FEMA | FEMA - <id> - Photograph by <photographer> taken on <date> in <location> | |
AntWeb images | All the images found on http://www.antweb.org/ depicting different species of ants. | 32,000 | Dave Thau | File Upload Bot (AntWeb) | Images from AntWeb | <desc> <specimenID> profile <viewnumber> | ||
Images of erosion | All the images found on http://picasaweb.google.com/VolkerPrasuhn depicting erosions. | 700 | Leyo | manual | Images by Volker Prasuhn | |||
livepict.com | All the images found on http://livepict.com/ depicting bands. | 1000 | Justass | Justass | Images from LivePict | |||
Tropenmuseum | A partnership with Tropenmuseum | 40,000 | Multichill | KITbot | svn | Images from the Tropenmuseum | COLLECTIE TROPENMUSEUM <desc> TMnr <id> |
2010 - 2013
Date | Name (Subpage) | Description | Images | Scripter | Uploader | Script | Category | File naming |
---|---|---|---|---|---|---|---|---|
Randolph Caldecott | All pages in The complete collection of pictures & songs / by Randolph Caldecott | 510 | Diaa abdelmoneim | Dudubot | upload.py | The complete collection of pictures & songs by Randolph Caldecott | Randolph Caldecott collection-page <page> | |
Rob Lavinsky | Mineral images from Rob Lavinsky on mindat.org | 34,917 | Reinhard Kraasch | RKBot | upload.py + pyodbc | Images by Rob Lavinsky | <mineral1>[-<mineral2>[<mineral3>]]-<mindatID> | |
Rob Lavinsky | Mineral images from Rob Lavinsky on irocks.com | 20,582 | Reinhard Kraasch | RKBot | upload.py + pyodbc | Images by Rob Lavinsky | <mineral1>[-<mineral2>[<mineral3>]]-<irocks file name> | |
Bibliothèque Nationale de France | Books provided by the Bibliothèque Nationale de France (French National Library) as part of a partnership with Wikimédia France | 1,413 | Seb35 (with help from Plyd and Jean-Fred) | BnF import, operated by Tim Starling | svn | Books provided by the BNF | <Author> - <Title>.djvu | |
Erling Mandelmann | Portraits of notable people donated from Erling Mandelmann | 581 | Diaa abdelmoneim | Dudubot | Photographs by Erling Mandelmann | <Title> - <Author> | ||
Travelers in the Middle East Archiven | Historical images from books about the Middle East from Travelers in the Middle East Archive, provided by Rice University | 2,277 | Diaa abdelmoneim | Dudubot | Images from the Travelers in the Middle East Archive | "<Title>" (<Year>) - TIMEA | ||
Fonds Eugène Trutat | Photographs by famous French photographer Eugène Trutat, donated by the City Archives of Toulouse as part of a partnership with Wikimédia France | 200 | Jean-Frédéric | TrutatBot | GitHub | Fonds Trutat - Archives municipales de Toulouse | <Title> (<Year>) - <Id> - Fonds Trutat | |
Nordiska Museet | A collection of early photographs, donated by Nordiska Museet as part of a collaboration with Wikimedia Sverige. | 1,000 | Prolineserver | NordiskaMuseetBot | Toolserver | Images from Nordiska museet | <Title> - Nordiska Museet - <Id>.jpg | |
Commons:Batch uploading/Adams | Ansel Adams National Park Service photographs | 221 | User:Kaldari | User:File Upload Bot (Kaldari) | Perl | Category:2011 Ansel Adams donation from U.S. National Archives | Ansel Adams - National Archives - 79-AA-<digit digit>.jpg | |
Web Gallery of Art | Large collection of well documented artworks. Uploaded ~15k new files and synchronized metadata for ~6k already uploaded files | 21,700 | Jarekt | JarektUploadBot | UploadWGA.py FixWGAMetadataInfo.py FixWGAMetadataArt.py | Images from Web Gallery of Art | <Author> - <Title> - WGA<ID>.jpg | |
Commons:Batch_uploading/Monument_lists | Images of German cultural heritage monuments | 3000? | User:Elya, User:Raymond | User:SternthalerBot | cat | <STRING>-Nr. <##>, <STRING> (<####>).jpg | ||
Commons:Chris's Acorns | Large collection of Acorn computer hardware and peripherals from Chris's Acorns | 1700 | Smallman12q | Smallbot | C#4 w/ LINQ and MSHTML interop | Chris's Acorns | just filename...no format | |
Commons:Batch uploading/Flickr Fotostream of NOAA Photo Library | Botanical images | ? | User:Kobac | Category:Images_from_NOAA | ||||
Walters Art Museum | Collection of 3D and 2D artworks from around the world | 19,000 | Kaldari | File Upload Bot (Kaldari) | modified botclasses.php | Media contributed by the Walters Art Museum | <Author> - <Title> - Walters <ID> - <View>.jpg | |
Commons:Bible Illustrations | Bible illustrations | 2993 | Smallman12q | OrophinBot | VBScript, XHR, XMLDOM, MSHTML, COM | Media contributed by the Sweet Publishing | <name> <chapter>-<section> (Bible Illustrations by Sweet Media).jpg | |
Flora Batava | Illustrations of all plants in the Netherlands | 1582 | Rillke | FloraUploadR | own implementation using VB6/COM/C++ | Files uploaded from Flora Batava by FloraUploadR | <latin plant name> — Flora Batava — Volume v<number>.jpg | |
Commons:Bots/Requests/Smallbot 2 | Oregon Historical County Records Guide | 4273 | Smallman12q | Smallbot | VBScript, XHR, XMLDOM, MSHTML, COM | Category:Images_from_Oregon_Historical_County_Records_Guide | <name> (<Countyname> County, Oregon scenic images) (<id>).jpg | |
The World's Columbian Exposition | PD photos of the World's Columbian Exposition | 115 | Rillke | RillkeBot | own implementation using VB6/COM/C++ | World Columbian Exposition taken by Press Chicago Photo-Gravure Co. | <caption> — Official Views Of The World's Columbian Exposition — <file number>.jpg | |
Defenselink | Defense.gov News Photos | 14572 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Defense.gov News Photos to check | Defense.gov News Photo <VRIN>[ - description].jpg | |
U.S. Army Map Service | Maps of India and Pakistan from the U.S. Army Map Service | 304 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | India maps by U.S. Army Map Service | Map India and Pakistan 1-250,000 Tile <tile name>.jpg | |
Defense.gov Photo Essays | Defense.gov Photo Essays | 23106 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:Defense.gov photo essays to check | Defense.gov photo essay <VRIN>.jpg | |
Navy SEAL pics and vids | Navy SEAL pics and vids | 682 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:United States Navy SEALs Images to check | United States Navy SEALs <NUMBER>.jpg | |
Beaverton, Oregon Historical Photo Gallery | Beaverton, Oregon Historical Photo Gallery | 305 | Smallman12q | Smallbot | VBScript, XHR, XMLDOM, MSHTML, COM | Category:Beaverton, Oregon Historical Photo Gallery | <name> (Beaverton, Oregon Historical Photo Gallery) (<number>).jpg | |
ForestWander | Mostly nature photos from West Virginia | 2600 | Rillke | Forestwander Nature Photography upload bot | own implementation using VB6/COM/C++ | Category:Bot-uploaded files from Forestwander Nature Photography | <name> - [West Virginia|Virginia] - ForestWander.jpg | |
Navy SEAL pics and vids | U.S. Navy SEALs pictures and videos | 681 pics, 56 vids | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:United States Navy SEALs Images to check Category:United States Navy SEALs Videos to check | images: United States Navy SEALs <number>.jpg, videos: different | |
Umair Zafar fashion shoot | Umair Zafar fashion shoot | 91 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:Images from Umair Zafar fashion shoot to check | different | |
New Orleans Bee | New Orleans Bee | 136667 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:The_New_Orleans_Bee_by_year | The New Orleans Bee <year> <month> <number>.pdf | |
Brooklyn Museum | Brooklyn Museum | 3629 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:African art in the Brooklyn Museum | Brooklyn Museum <ID> <SHORT DESC>.jpg | |
U.S. Marines Corps | U.S. Marines Corps | 77288 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:Marines.mil_images_to_check | USMC-<NUMBER>.jpg or USMC-<VRIN>.jpg | |
Photographic History of the Civil War | Photographic History of the Civil War | 3668 | Mattwj2002, Slick | Mattwj2002, Slick-o-bot | pywikipediabot and some bash scripts | Category:The_Photographic_History_of_The_Civil_War | The Photographic History of The Civil War Volume <VOLUME> Page <NUMBER>.jpg | |
Rijksdienst voor het Cultureel Erfgoed | Photos of historic buildings in the Netherlands (Rijksmonumenten) | 4650000 | Multichill | Multichill | pywikibot based | Category:Images from the Rijksdienst voor het Cultureel Erfgoed | <title> - <id> - RCE.jpg | |
Commons:Batch uploading/AELG | Photos of Galician writers | 800 | User:Smallman12q | User:Smallbot | Category:Images from AELG | <NAME> (AELG)-<N>.jpg | ||
Defence Imagery (UK) | High quality selected photographs by the UK Ministry of Defence (MoD), released on the Open Government Licence (equivalent to Public Domain with an attribution requirement) | 2,880 | Fæ | Fæ | pywikipediabot | Category:Images from MoD uploaded by Fæ | <MoD title> MOD <file number>.jpg | |
Weather maps | Weather maps of the USA, daily and weekly from the U.S. National Oceanic and Atmospheric Administration | 20,000 (10 year archive) and ongoing at 5 new maps per day | User:Fæ | User:Fæ | pywikipediabot | Category:NCEP maps by year | <YYYY-MM-DD> <map type> NOAA.png | |
Los Angeles County Museum of Public Art | Art history - photographs of artifacts from LACMA | 22,000 | Fæ | Fæ | pywikipediabot | Category:Images from LACMA uploaded by Fæ | <LACMA description> LACMA <Accession Number>.jpg | |
LSH | Objects in the LSH-museum collections | 19,961 (approx. 1,500 missing from or misnamed on the drive and still to be uploaded) | Lokal_Profil | LSHuploadBot | own script | Images from Livrustkammaren och Skoklosters slott med Stiftelsen Hallwylska museet | <description> - <museum> -_ <imageid>.<filetype> |
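The File naming column above uses `<Placeholder>` patterns such as `<Author> - <Title> - WGA<ID>.jpg`. As a minimal sketch of how such a pattern could be filled from a metadata record before upload, the Python snippet below substitutes each placeholder and replaces characters that are likely to be rejected in file names. The helper name `fill_pattern`, the sample record and the exact set of replaced characters are assumptions for illustration; none of the listed bots is known to work this way, and the real title restrictions should be checked against the MediaWiki documentation.

```python
import re

# Characters assumed to be unsafe in Commons file names (illustrative set only).
_UNSAFE = re.compile(r"[#<>\[\]|{}:/\\]")


def fill_pattern(pattern, fields):
    """Fill a '<Placeholder>' naming pattern with values from a metadata dict.

    Raises KeyError if the pattern names a field that the record lacks,
    so an incomplete record is never turned into a half-filled file name.
    """
    def substitute(match):
        value = str(fields[match.group(1)])
        # Replace unsafe characters in the value, not in the pattern itself.
        return _UNSAFE.sub("-", value).strip()

    name = re.sub(r"<([^<>]+)>", substitute, pattern)
    # Collapse runs of whitespace left over from messy source metadata.
    return re.sub(r"\s+", " ", name)


if __name__ == "__main__":
    # Hypothetical record; the pattern is the one listed for Web Gallery of Art.
    record = {"Author": "Example Painter", "Title": "Still life: fruit", "ID": 12345}
    print(fill_pattern("<Author> - <Title> - WGA<ID>.jpg", record))
```

Raising on a missing field rather than leaving the placeholder in place is a deliberate choice here: a batch run then stops at the bad record instead of uploading a file with a literal `<Title>` in its name.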
2014
Date | Name (Subpage) | Description | Images | Scripter | Uploader | Script | Category | File naming |
---|---|---|---|---|---|---|---|---|
Fonds Trutat − Muséum de Toulouse | Historical images by Eugène Trutat | 213 | Jean-Frédéric | TrutatBot | GitHub | Category:Media contributed by the Muséum de Toulouse | <Title> - Fonds Trutat - <Id> | |
Archives Nationales (France) | Archive documents from the French history | 77 | Jean-Frédéric | ArchivesNationalesBot | GitHub | Category:Media contributed by the Archives Nationales (France) | <Title> <Page> - Archives Nationales - <Id> | |
Commons:Batch uploading/World Digital Library | Old books from WDL | - | Fæ | - | Pywikibot | |||
geo/map-marker icons by Nicolas Mollet | More than 700 free icons to use as placemarks for points of interest (POI) on maps | 6,880 | Rillke | GeoUploadR | node.js / nodemw | Category:Map icons by Nicolas Mollet – Uploaded by GeoUploadR | Map marker icon – Nicolas Mollet – <Title> – <Category> – <Style>.png | |
EnergieagenturNRW | Contemporary - active North Rhine-Westphalia (German) politicians | 2,249 (61% of the Flickrstream) | Fæ | Fæ | pywikipediabot | EnergieagenturNRW | <Flickr title> (<Flickr ID>).jpg | |
RA | Coats of arms drawn by the National Archives of Sweden | 336 | André Costa (WMSE) | RA-uploadbot | PyCJWiki (modified) | Coats of arms by the National Archives of Sweden | <name> <type>vapen - Riksarkivet Sverige.png | |
Atlas de Wit | 17th-century Dutch atlas of the Low Countries from the collections of the Koninklijke Bibliotheek (Dutch National Library) | 145 | Husky | HuskyBot | Pywikibot (script) | Atlas de Wit 1698 | Atlas de Wit 1698-<page>-KB PPN 145205088.jpg | |
goodfreephotos.com | different public domain images, landscapes, objects and so on ... | 3547 | Slick | Slick-o-bot | pywikipediabot and some bash scripts | Category:Images_from_goodfreephotos.com and Subcats of Category:Import by User:Slick-o-bot/Images from goodfreephotos.com (based on galleries for maintenance) | Gfp-<name>.jpg | |
Sustainable Sanitation Alliance | Contemporary photographs of sustainable sanitation, Africa | 9,810 | Fæ | Fæ | pywikipediabot | Files created by Sustainable Sanitation Alliance (SuSanA) | <Flickr title> (<Flickr ID>).jpg | |
KNBLO | Images of the Vierdaagse (walking event) from 1910-1940 | 1,183 | GWToolset (Basvb) | Basvb | GWToolset | Images from KNBLO | <description> - <id> - KNBLO.jpg | |
(upload) (description fixes) | Wigman | Images of nature photographer A.B. Wigman | 576 | Basvb | GA Ede (upload) BasBot (description fixes) | Uploadwizard (upload) pywikibot (description adding) | A.B. Wigman/Images from Gemeentearchief Ede (could be filled with other images as well) | <description> - A.B. Wigman - <id>.jpg |
Commons:Batch uploading/Atlas of Mutual Heritage | Old maps | 2479 | User:Husky and User:Gerritdeveer1597 | User:HuskyBot | Category:Media from Atlas of Mutual Heritage | AMH-NNNN-XX <description>.jpg | ||
Commons:Batch uploading/Wellcome Images | Medical history | 99,000 | Fæ | - | pywikibot | |||
RCE shipwrecks | Images of Shipwrecks in the Netherlands | 18,568 | Basvb | BasBot | pywikibot | Images of shipwrecks from the Rijksdienst voor het Cultureel Erfgoed | <description> - <shipwreck> - <id> - RCE.jpg |
2015
Date | Name (Subpage) | Description | Images | Scripter | Uploader | Script | Category | File naming |
---|---|---|---|---|---|---|---|---|
Commons:Batch uploading/Manuscripts by Srečko Kosovel | Images of writings by Srečko Kosovel | 1050 | User:Sporti | User:Sporti | semi-automatic | Category:Manuscripts by Srečko Kosovel | Srečko Kosovel - <title>.jpg | |
Commons:Batch uploading/US Army Research Laboratory Eniac | A few images of ENIAC-era Army computer systems | 13 | BMacZero | BMacZero | C# custom | Category:ENIAC, Category:EDVAC, Category:ORDVAC, Category:BRLESC-I, etc. | ||
Commons:Batch uploading/Freshwater and Marine Image Bank | PD images related to all things marine and limnological | 20747 | User:BMacZero | User:BMacZeroBot | C# custom | Category:Images from the Freshwater and Marine Image Bank | FMIB NNNNN <title>.jpeg | |
VanderGrinten | Images of 19th century buildings in Nijmegen | 808 | GWToolset (Basvb) | Basvb | GWToolset | Images from Evert van der Grinten | <address>/Nijmegen - <description> - <collectionid> - Van der Grinten.jpg |
2016
Date | Name (Subpage) | Description | Images | Scripter | Uploader | Script | Category | File naming |
---|---|---|---|---|---|---|---|---|
Codex Aureus | NLS collection (establishing XML workflow) | 1,503 + 393 + 265 | Fæ & PeterKz | Fæ | GWT / https://github.com/peterk/suecia2commons | Codex Aureus (A 135) | Codex Aureus (A 135) p<page>.tif <title> (SELIBR <libris>)-<page>.tif | |
Commons:Batch uploading/Fortepan.HU | Fortepan photographs, Hungary | 69,857 | Fæ | Fæ | Custom | Images from Fortepan | <autogenerated title> Fortepan <image number>.jpg | |
Commons:Batch uploading/Imperial Encyclopaedia | 18th Century Gujin Tushu Jicheng | 800+ | User:Fæ | NA | Custom | Gujin Tushu Jicheng | . | |
Photographs by Adolf and Carl Dransfeld | Photographs by Adolf and Carl Dransfeld | 1304 | Reinhard Kraasch | RKBot | Custom (pywikibot based) | Photographs by Adolf and Carl Dransfeld | HANSif<image#> <title>.tif HANSif<image#> <title>.jpg (cropped version) |
- Commons:Batch uploading/Archives Nationales (France)
- Commons:Batch uploading/Tropenmuseum Expeditions
- Commons:Batch uploading/Catharijne Convent
2017
Date | Name (Subpage) | Description | Images | Scripter | Uploader | Script | Category | File naming |
---|---|---|---|---|---|---|---|---|
NPS Maps | Public domain maps of U.S. National Parks, published by the National Park Service. | 1968 | Reinhard Kraasch | RKBot | Custom (pywikibot based) | Files from the National Park Service uploaded by RKBot | NPS <title>.<file type> | |
Incompetech music | CC-BY-3.0 music files | 1,277 | Fæ | NA | Pywikibot | Category:Audio files from Incompetech | <title> (ISRC <ref>).mp3 | |
Edo period coin collecting catalogues | Public domain Japanese coin collecting catalogues | 25 | NA | Donald Trung | NA | Category: Kokin kousei, Shinsen zeni kagami and Category:Shinpan kaisei, Kosen nedantsuke, Narabi ni bantsuki | .jpg | |
Zeno images | Public domain images | 23,834 | Fæ | NA | Pywikibot | Category:Images from zeno.org | <title> (Zeno <collection>).jpg |
2018
- Commons:Batch uploading/AucklandMuseumCCBY, 119,858, User:Fæ
- Commons:Batch uploading/Trainpix, 9,954, User:Fæ
- Commons:Batch uploading/Illustrations of Vietnamese cash coins from Ed Toda's "Annam and its minor currency"., 290, Donald Trung
- Commons:Batch uploading/Kieler Stadtarchiv, 27,500, User:Fæ
Failed
Date | Name (Subpage) | Fail Reason |
---|---|---|
Flickr Imre Solt collection | Denied because the UAE has no freedom of panorama (FOP), so most of the images would be copyright violations. | |
Commons:Batch uploading/Modern Egypt Digital Archive | Egyptian copyright law has no separate term for photographs; a photograph becomes public domain only 50 years after the author's death. Not enough images for a batch. | |
Commons:Batch uploading/Images from LIFE | Most of the images didn't have a clear copyright label. | |
Commons:Batch uploading/Gathering the Jewels | Images don't appear to be free. | |
Commons:Batch uploading/Staffordshire Gold Hoard (en.Wikipedia front page news) | The license on the images was changed from Share Alike to Non-commercial on the same day. | |
Commons:Batch uploading/World War II in Africa from Flickr user gbaku | User wasn't author of the album, only purchased the images. | |
Commons:Batch uploading/Kartrummet | Website showed no interest in a partnership; license verification not possible. | |
Commons:Batch uploading/beeldengeluidwiki | Authorship unclear. | |
Commons:Batch uploading/Dermnet | Owner of the website doesn't own the images. | |
Commons:Batch uploading/Ekta Media | Not done, dead link | |
EVDeportes | Already uploaded to Commons. | |
Commons:Batch uploading/Media of "banco de imágenes" of Ministry of Education of Spain | cc-by-nc | |
Commons:Batch uploading/Sir William MacArthur Botanical Images | Low quality | |
Commons:Batch uploading/Spanking Art Wiki | ||
Commons:Batch uploading/Land Air Sea Warfare | Unclear what to upload; the request was incomplete and the user did not respond. | |
Commons:Batch uploading/WWII | Authorship unclear. | |
Commons:Batch uploading/US Coast Guard | dead link | |
Commons:Batch uploading/Nasa Technical Reports Server (NTRS) | Public NTRS access suspended indefinitely. | |
Commons:Batch uploading/KROK2009 | Out of scope (portraits) |