
Gebruiker:Valhallasw/imagesearch

Current version

#################################
# INTERWIKI IMAGE HARVESTER
# Follows the interwiki links on a page and collects the images used on the
# other-language wikis, to find usable images for the subject
#
# Uses a wiki format to output, but outputs to stdout.
#
# (C)2006 by [[nl:Gebruiker:Valhallasw]] and [[nl:Gebruiker:Gerbennn]]
# Licensed under the MIT licence
#################################


import wikipedia,config,re,sys
try:
  # loading blank lists
  images = []
  commons = []
  nocommons = []

  page = wikipedia.Page(wikipedia.getSite(), wikipedia.input("Please enter wiki page:"))

  ########
  # 1) Get a list of images
  ##############################################################################

  # get interwiki links
  links = page.interwiki()
  for link in links:
    sys.stdout.flush() # flush stdout (useful when piping to a file)
    try:
      # add the found image links to the images list
      images.extend(link.imagelinks())
    except wikipedia.IsRedirectPage as target:
      wikipedia.output(u'DBG: %s raises IsRedirectPage to %s' % (link.aslink(), target.args[0]))
      try:
        # if the page is a redirect, try using the redirect target
        link = wikipedia.Page(link.site(), target.args[0])
        images.extend(link.imagelinks())
      except (wikipedia.NoPage, wikipedia.PageNotFound, wikipedia.IsRedirectPage): #except non-fatal errors
        wikipedia.output(u'DBG: %s raised non-fatal %s exception' % (link.aslink(), sys.exc_info()[0]))

    except (wikipedia.NoPage, wikipedia.PageNotFound): #except non-fatal errors
      wikipedia.output(u'DBG: %s raised non-fatal %s exception' % (link.aslink(), sys.exc_info()[0]))

  ########
  # 2) Sort the images to commons- and site-images; Remove commons duplicates
  ##############################################################################
  for image in images:
    sys.stdout.flush()

    # try to retrieve the image from commons:
    temppage = wikipedia.Page(wikipedia.getSite('commons','commons'),'image:'+image.titleWithoutNamespace())
    if temppage.exists():
      wikipedia.output(u'Found commons image: '+temppage.aslink())
      if temppage not in commons: # skip pages that are already in the list
        commons.append(temppage)
    else:
      wikipedia.output(u'Found image: '+image.aslink())
      nocommons.append(image)

  ########
  # 3) Output the information
  ##############################################################################

  # Output a level-2 heading with the page title
  wikipedia.output(u'==%s==' % page.title())
  # Output table with commons images and descriptions
  wikipedia.output(u'===Commons===\n{| class="prettytable"')
  for image in commons:
    wikipedia.output(u'|-\n|[[%s|200px]] || %s' % (image.title(), re.sub('(\n|</?nowiki>)','',image.get())))
    #                       200px thumb     description with newlines and <nowiki> tags stripped

  #output images that do not appear on commons as an unnumbered list
  wikipedia.output(u'|}\n===Non-commons===')
  for image in nocommons:
    wikipedia.output(u'* [[:%s:%s]]' % (image.site().lang, image.title()))

finally:
    wikipedia.stopme()
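
Steps 2 and 3 of the script above can be tried without a live wiki. The following is a minimal sketch, not the script itself: `partition_images`, `render_commons_table` and the `exists_on_commons` callable are hypothetical names, the callable stands in for the `exists()` check against Commons, and the sample titles are made up.

```python
import re


def partition_images(titles, exists_on_commons):
    """Split image titles into (commons, nocommons), dropping Commons duplicates.

    `exists_on_commons` is a hypothetical callable that stands in for the
    Page.exists() lookup against Commons.
    """
    commons, nocommons, seen = [], [], set()
    for title in titles:
        if exists_on_commons(title):
            if title not in seen:  # keep only the first occurrence
                seen.add(title)
                commons.append(title)
        else:
            nocommons.append(title)  # as in the script, not deduplicated
    return commons, nocommons


def render_commons_table(rows):
    """Render (title, description) pairs as a 'prettytable' wikitable."""
    lines = ['{| class="prettytable"']
    for title, description in rows:
        # Strip newlines and <nowiki> tags so the description fits in one cell.
        flat = re.sub(r'(\n|</?nowiki>)', '', description)
        lines.append('|-\n|[[%s|200px]] || %s' % (title, flat))
    lines.append('|}')
    return '\n'.join(lines)


on_commons = {'Kane QC.png'}  # made-up sample data
commons, nocommons = partition_images(
    ['Kane QC.png', 'Local.jpg', 'Kane QC.png'], on_commons.__contains__)
table = render_commons_table(
    [('Image:' + title, 'A <nowiki>demo</nowiki>\ntext') for title in commons])
print(table)
```

Tracking seen titles in a set keeps the duplicate check cheap for long image lists, while the list preserves the order in which the images were found.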


Old version(s)

A small Python script to generate a list of images.


Development version; for now it can only be used by pasting it directly into the Python interpreter ;)

import wikipedia,config,re

page = wikipedia.Page(wikipedia.getSite(),"Kwantumcomputer")
contents = page.get()
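# find interwiki links of the form [[xx:Title]] (two-letter language codes only)
links = re.findall(r'\[\[..:.*\]\]', contents)

for link in links:
        # split the link into its language code and page name
        site = wikipedia.getSite(re.sub(r'\[\[(..):.*\]\]', r'\1', link))
        pagename = re.sub(r'\[\[..:(.*)\]\]', r'\1', link)
        temppage = wikipedia.Page(site, pagename)
        # the interactive interpreter echoes the returned list of images
        temppage.imagelinks()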

Of course, replace Kwantumcomputer with a page of your choice ;)


pl:Grafika:Kane QC.png
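
The regular expressions in the development version only match two-letter language codes, and the greedy `.*` runs on to the last `]]` when two links share a line. A possible alternative is sketched below; it is an illustration only, `find_interwiki_links` and `known_codes` are hypothetical names, and the sample text is made up. The later versions avoid the problem altogether by calling `page.interwiki()`.

```python
import re

# "[[code:Title]]": a lowercase language code, a colon, then a title that
# contains no brackets or pipes, so each match stops at its own "]]".
INTERWIKI_RE = re.compile(r'\[\[([a-z-]+):([^\[\]|]+)\]\]')


def find_interwiki_links(wikitext, known_codes):
    """Return (language code, title) pairs for links whose prefix is a known code."""
    return [(code, title.strip())
            for code, title in INTERWIKI_RE.findall(wikitext)
            if code in known_codes]


# Made-up sample text; the set of language codes is supplied by the caller.
sample = (u'[[pl:Komputer kwantowy]] [[de:Quantencomputer]]\n'
          u'[[Image:Kane QC.png|thumb]] [[zh-min-nan:Voorbeeld]]')
links = find_interwiki_links(sample, {'pl', 'de', 'zh-min-nan'})
print(links)
```

Checking the prefix against a set of known codes keeps lowercase namespace prefixes such as `image:` from being mistaken for a language.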


Attempted improvement by Gerbennn

import wikipedia,config,re
try:
    page = wikipedia.Page(wikipedia.getSite(),"Computer")
    links = page.interwiki()
    for link in links:
            link.imagelinks()
finally:
    wikipedia.stopme()



More from Valhalla:

# Interwiki image harvester
# (C) 2006 by [[nl:Gebruiker:Valhallasw]] and [[nl:Gebruiker:Gerbennn]]
# Licensed under the MIT License
import wikipedia,config,re,sys
try:
  images = []
  page = wikipedia.Page(wikipedia.getSite(),"Computer")
  links = page.interwiki()
  for link in links:

    try:
      images = images + link.imagelinks()
    except wikipedia.IsRedirectPage as target:
      wikipedia.output(u'DBG: %s raises IsRedirectPage to %s' % (link.aslink(), target.args[0]))
      try:
        link = wikipedia.Page(link.site(), target.args[0])
      except:
        wikipedia.output(u'DBG: %s raises %s' % (link.aslink(), sys.exc_info()[0]))

    except:
        wikipedia.output(u'DBG: %s raises %s' % (link.aslink(), sys.exc_info()[0]))

  out = u'{| class="prettytable" \n|'+page.aslink()+u'\n|'
  for image in images:
    temppage = wikipedia.Page(wikipedia.getSite('commons','commons'),'image:'+image.titleWithoutNamespace())
    if temppage.exists():
      wikipedia.output(u'Found commons image: '+temppage.aslink())
      out = out + '<b>[[:%s]]</b><br />' % temppage.title()
    else:
       wikipedia.output(u'Found image: '+image.aslink())
       out = out + '[[:%s:%s]]<br />' % (image.site().lang, image.title())

  out = out + u'\n|}'
  wikipedia.output(out)

finally:
    wikipedia.stopme()

gives:

 