How to use the scrape-it.scrapeHTML function in scrape-it

To help you get started, the examples below show common ways scrapeIt.scrapeHTML is used in public projects.

GitHub: vitalybe / radio-stream / app / js / app / utils / web_search.js
async firstHref(query) {
    const logger = loggerCreator("firstHref", moduleLogger);

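    // encodeURIComponent also escapes "&", "#", "/" and ":" so the whole query stays inside the q parameter
    const encodedQuery = encodeURIComponent(query);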
    let url = "https://duckduckgo.com/html/?q=" + encodedQuery + "&ia=web";
    logger.info(`fetching ${url}...`);
    const response = await fetch(url);
    logger.info(`fetched. extracting text...`);
    const text = await response.text();
    // NOTE: scrapeIt could in theory perform the HTTP request, but it just hangs (on android), so I am using fetch
    // directly
    const result = await scrapeIt.scrapeHTML(text, {
      href: { selector: ".result__url", eq: 0 },
    });
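    // Return null when nothing matched, so callers can test the result for truthiness
    return result.href ? "https://" + result.href : null;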
  }
}
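
The search URL in the excerpt above is built by concatenating an encoded query onto a q= parameter. As a side note, the sketch below compares how the two standard encoders treat such a query. The query string and the qFrom helper are hypothetical and use only built-in JavaScript functions.

```javascript
// Hypothetical query; the "&" is the character that causes trouble.
const query = "site:genius.com AC/DC Rock & Roll";
const base = "https://duckduckgo.com/html/?q=";

// encodeURI leaves reserved characters such as "&", "/" and ":" untouched.
const withEncodeURI = base + encodeURI(query);
// encodeURIComponent escapes them, so the whole query stays in one parameter.
const withComponent = base + encodeURIComponent(query);

// Helper (illustrative): read back what a server would see as the q parameter.
const qFrom = (url) => new URL(url).searchParams.get("q");

console.log(qFrom(withEncodeURI)); // cut off at the "&"
console.log(qFrom(withComponent)); // the full original query
```

For query text that may contain user-supplied or song-title characters, encodeURIComponent is usually the safer choice for a single parameter value.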
GitHub: vitalybe / radio-stream / app / js / app / utils / lyrics / lyrics_real.js
async find(song) {
    const logger = loggerCreator("find", moduleLogger);
    let lyrics = "";

    const query = `site:genius.com ${song.artist} ${song.title}`;
    logger.info(`querying: ${query}`);
    const href = await webSearchGetter.get().firstHref(query);
    logger.info(`got href: ${href}`);

    if (href) {
      logger.info(`fetching ${href}...`);
      const response = await fetch(href);
      logger.info(`fetched. extracting text...`);
      const text = await response.text();
      const result = await scrapeIt.scrapeHTML(text, { lyrics: { selector: ".lyrics" } });
      if (result.lyrics) {
        logger.info(`got lyrics`);
        lyrics = result.lyrics;
      } else {
        logger.info(`no lyrics`);
      }
    }

    return lyrics;
  }
}
GitHub: ERS-HCL / nxplorerjs-microservice-starter / server / api / services / scraper.service.ts
.then(html => {
              const data = scrapeIt.scrapeHTML(
                html,
                this.getConfiguration(url)
              );
              const updatedData = this.transformScrapedData(
                data,
                url,
                null,
                url
              );
              resolve(updatedData);
            })
            .catch(err => {

scrape-it

A Node.js scraper for humans.

MIT