2.1.0 • Published 8 years ago

liqen-scrapper v2.1.0

Weekly downloads: 2
License: MIT
Repository: github
Last release: 8 years ago

Liqen Scrapper 2

Find news articles and extract the relevant information from them.

This project uses

  1. Google Custom Search to search media websites.
  2. Scraping techniques to extract the content of an article.
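To make the first step more concrete, here is a minimal sketch of how a request to the Google Custom Search JSON API can be built from the same apiKey and cx values used in the examples of this README. The helper name buildSearchUrl is hypothetical and is not part of liqen-scrapper; the package may build its requests differently.

```javascript
// Hypothetical helper, not part of liqen-scrapper: builds a request URL for
// the Google Custom Search JSON API from a search term and credentials.
const SEARCH_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'

function buildSearchUrl (term, { apiKey, cx }) {
  const url = new URL(SEARCH_ENDPOINT)
  url.searchParams.set('key', apiKey) // API key of the Google project
  url.searchParams.set('cx', cx)      // ID of the custom search engine
  url.searchParams.set('q', term)     // the search query itself
  return url.toString()
}

console.log(buildSearchUrl('climate change', {
  apiKey: 'MY_GOOGLE_API_KEY',
  cx: 'MY_CX'
}))
```

The JSON response of that endpoint contains an items array whose entries have title and link fields, which matches what the googleSearch examples in this README read.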

Usage

This package includes two functions that can be used together or separately:

  • googleSearch(term, options) => Promise<Object>, which performs a Google search
  • downloadArticle(uri) => Promise<Object>, which downloads and parses an article
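The examples in this README read only a few fields of the resolved objects: metadata.title and body.html on an article. As a rough sketch, and assuming no more than those fields, an article can be reduced to a short summary like this. The helper summarizeArticle and the sample object are hypothetical and only illustrate the shape.

```javascript
// Hypothetical helper, not part of liqen-scrapper: keeps only the fields
// of an article object that the examples in this README rely on.
function summarizeArticle (article, previewLength = 80) {
  return {
    title: article.metadata.title,
    preview: article.body.html.slice(0, previewLength)
  }
}

// Hand-written sample with the assumed shape, not real scraper output.
const sampleArticle = {
  metadata: { title: 'Sample title' },
  body: { html: '<p>Sample body</p>' }
}

console.log(summarizeArticle(sampleArticle, 9))
```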

Examples

Using only googleSearch

const { googleSearch } = require('liqen-scrapper')

const options = {
  apiKey: 'MY_GOOGLE_API_KEY',
  cx: 'MY_CX'
}

googleSearch('climate change', options)
  .then(result => result.items)
  .then(items => items.forEach(item => {
    console.log(item.title)
    console.log(item.link)
  }))

Using only downloadArticle

const { downloadArticle } = require('liqen-scrapper')

// Download and parse the article, then print its title and a short preview
downloadArticle('http://cultura.elpais.com/cultura/2017/02/08/actualidad/1486573775_868895.html')
  .then(article => {
    console.log(article.metadata.title)
    console.log(article.body.html.slice(0, 80))
  })

Using both functions together

const { googleSearch, downloadArticle } = require('liqen-scrapper')
const options = {
  apiKey: 'MY_GOOGLE_API_KEY',
  cx: 'MY_CX'
}

// Search, then download every result; Promise.all must receive the array
// of download promises, so it is called inside the chain
googleSearch('climate change', options)
  .then(result => result.items.map(item => item.link))
  .then(links => Promise.all(links.map(downloadArticle)))
  .then(articles => articles.map(article => article.body.html))
  .then(bodies => {
    bodies.forEach(body => {
      console.log(body.slice(0, 80))
    })
  })
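
Promise.all rejects as soon as a single download fails, so one unreachable page discards the whole batch. A possible alternative, sketched here under the assumption that partial results are acceptable, is Promise.allSettled (available in Node.js 12.9 and later). The helper downloadAll is hypothetical and not part of liqen-scrapper; it takes the download function as a parameter, so downloadArticle can be passed in.

```javascript
// Hypothetical helper, not part of liqen-scrapper: downloads every link and
// resolves with the successful results only, silently skipping failures.
// `download` is any function uri => Promise, for example downloadArticle.
function downloadAll (links, download) {
  // The async wrapper turns synchronous throws into rejections as well
  const attempts = links.map(async link => download(link))

  return Promise.allSettled(attempts)
    .then(results => results
      .filter(result => result.status === 'fulfilled')
      .map(result => result.value))
}
```

With it, the last steps of the chain could become .then(links => downloadAll(links, downloadArticle)), followed by the same mapping over article.body.html.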

Docs

See the /docs directory for more documentation.

  • 2.1.0: 8 years ago
  • 2.0.2-0: 8 years ago
  • 2.0.1-0: 8 years ago
  • 2.0.0-1: 8 years ago
  • 2.0.0-0: 8 years ago
  • 1.7.4: 8 years ago
  • 1.7.2: 8 years ago
  • 1.7.1: 8 years ago
  • 1.7.0: 8 years ago
  • 1.6.0: 8 years ago
  • 1.5.0: 8 years ago
  • 1.4.1: 8 years ago
  • 1.3.0: 8 years ago
  • 1.2.0: 8 years ago
  • 1.1.0: 8 years ago
  • 1.0.0: 8 years ago