How to use the scrapelib.ScrapeUrl function in scrapelib

To help you get started, we’ve selected a few scrapelib examples, based on popular ways it is used in public projects.


Source (GitHub): mapsme / omim / crawler / wikipedia-download-pages.py
  # Excerpt from a loop body: `line`, `i`, and ARGS are defined outside this excerpt.
  (itemId, lat, lon, itemType, title) = json.loads(line)
  
  if lat < ARGS.minlat or lat > ARGS.maxlat or lon < ARGS.minlon or lon > ARGS.maxlon:
    continue

  if itemType == 'country' and (int(lat) == lat or int(lon) == lon):
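    # The placeholders were never filled in; the arguments below are an assumed choice.
    sys.stderr.write('Ignoring country {0} {1} - probably parallel or meridian\n'.format(itemId, title.encode("utf-8")))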
    continue

  fileName = urllib2.quote(title.encode("utf-8"), " ()") + ".html"
  url = "http://{0}.wikipedia.org/w/index.php?curid={1}&useformat=mobile".format(ARGS.locale, itemId)

  if '_' in title:
    sys.stderr.write('WARNING! Title contains "_". It will not be found!\n')

  scrapelib.ScrapeUrl(url, fileName, 1, i)
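
The excerpt calls scrapelib.ScrapeUrl(url, fileName, 1, i) without showing what the helper does. The block below is a minimal Python 3 sketch of that pattern, not the project's implementation. It assumes the third argument is a delay in seconds and the fourth is a label used only for logging. The names build_target and scrape_url, the fetch parameter, and the skip-if-file-exists behavior are all hypothetical and were introduced for illustration.

```python
import os
import sys
import time
import urllib.parse
import urllib.request


def build_target(locale, item_id, title):
    """Return (url, file_name) for one item, mirroring the excerpt above."""
    # Spaces and parentheses stay readable; everything else unsafe
    # (including "/") is percent-encoded so the title is a valid file name.
    file_name = urllib.parse.quote(title, safe=" ()") + ".html"
    url = "http://{0}.wikipedia.org/w/index.php?curid={1}&useformat=mobile".format(
        locale, item_id)
    return url, file_name


def scrape_url(url, file_name, delay_seconds, info="", fetch=None):
    """Hypothetical stand-in for scrapelib.ScrapeUrl.

    Downloads url into file_name and then waits delay_seconds.
    Returns True if a download happened, False if the file already existed
    (skipping existing files is an assumption, not confirmed by the source).
    """
    if os.path.exists(file_name):
        return False
    if fetch is None:
        # Default fetcher: a plain HTTP GET using the standard library.
        def fetch(target):
            with urllib.request.urlopen(target) as response:
                return response.read()
    data = fetch(url)
    with open(file_name, "wb") as out:
        out.write(data)
    sys.stderr.write("Downloaded {0} {1}\n".format(info, file_name))
    # Pause between requests to avoid hammering the server.
    time.sleep(delay_seconds)
    return True
```

Under these assumptions, the last line of the excerpt would correspond to scrape_url(url, file_name, 1, i). The fetch parameter exists only so the download step can be replaced in tests without network access.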