1.0.4 • Published 9 years ago

urlwalker v1.0.4

Weekly downloads
5
License
MIT
Repository
github
Last release
9 years ago

urlwalker

Traverse URL redirects and rel="canonical"'s until you hit the end of the line.

Example

var urlwalker = require('urlwalker')

var originalUrl = 'https://medium.com/p/3689f413af27?nonsense=something'
urlwalker.walk(originalUrl, function (err, finalUrl) {
  console.log(finalUrl)
  // https://medium.com/@evansolomon/siri-focus-my-attention-3689f413af27
})

You can optionally validate that the final URL actually works (does not return >=400 status). The validate parameter defaults to false.

// At the time of writing this, NYT generates a broken canonical URL, which should just be this
// URL but without the query params
var originalUrl = 'http://www.nytimes.com/roomfordebate/2015/05/08/can-the-us-make-peace-with-netanyahus-new-government?a=b'

urlwalker.walk({
  url: originalUrl,
  validate: true
}, function (err, finalUrl) {
  console.log('Validated:', finalUrl)
  // Validated: http://www.nytimes.com/roomfordebate/2015/05/08/can-the-us-make-peace-with-netanyahus-new-government?a=b
})


urlwalker.walk({
  url: originalUrl,
  validate: false
}, function (err, finalUrl) {
  console.log('Unvalidated:', finalUrl)
  // Unvalidated: http://www.nytimes.com/roomfordebate/2015/05/08/2015/05/08/can-the-us-make-peace-with-netanyahus-new-government

  // Notice that this URL is broken
})
1.0.4

9 years ago

1.0.3

9 years ago

1.0.2

9 years ago

1.0.1

9 years ago

1.0.0

9 years ago