8.54.1 • Published 6 years ago

add-urls-to-planning-file v8.54.1

License
ISC

add-urls-to-planning-file

NB: THIS PROGRAM HAS NOT BEEN TESTED FOR THE CASE WHERE THE PRODUCT IS NOT AVAILABLE ON A WEBSITE. THE BEHAVIOR IN THAT CASE IS UNPREDICTABLE.

Usage

ENV=production node bin/run.js -u "http://spreadsheetUrl.com" -n "sheetName"

Manual test

How it works

Given a SKU and a channel, it runs a Bing search using one of the following queries, depending on QUERY_TYPE:

site:amazon.de "HF3507/20" // QUERY_TYPE=v1
domain:amazon.de intitle:"QE75Q7F" "QE75Q7F" "Produktinformation" // QUERY_TYPE=v2
domain:amazon.de && intitle:"QE65Q7F" && "QE65Q7F" && -intitle:"Suchergebnis auf" // QUERY_TYPE=v3
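
The three query shapes above could be produced by a small helper like the one below. This is only a sketch: the function name buildQuery and its parameters are assumptions made for illustration, not the package's actual API, and the "Produktinformation" and "Suchergebnis auf" strings are copied from the amazon.de examples, so other channels would likely need different text.

```javascript
// Hypothetical helper (not part of the package): builds the search query
// for a SKU on a given domain, following the three example templates.
function buildQuery(sku, domain, queryType = "v1") {
  switch (queryType) {
    case "v1":
      return `site:${domain} "${sku}"`;
    case "v2":
      // "Produktinformation" comes from the amazon.de example above.
      return `domain:${domain} intitle:"${sku}" "${sku}" "Produktinformation"`;
    case "v3":
      // Excludes search-result listing pages ("Suchergebnis auf ...").
      return `domain:${domain} && intitle:"${sku}" && "${sku}" && -intitle:"Suchergebnis auf"`;
    default:
      throw new Error(`Unknown QUERY_TYPE: ${queryType}`);
  }
}

console.log(buildQuery("HF3507/20", "amazon.de", "v1"));
// prints: site:amazon.de "HF3507/20"
```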

Example

It returns the triple <URL, name, URL_QUALITY>, representing:

  • URL: the page of the product (e.g. where the product is sold)
  • name: the name of the page as it appears in the search engine results
  • URL_QUALITY: whether the name contains the SKU. The name is usually the title of the page, so if the title contains the SKU, there is a 99% chance that it is the correct page
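
As a rough illustration of the URL_QUALITY check, the sketch below treats it as a boolean that is true when the result name contains the SKU. The function names, the boolean representation, and the case-insensitive, whitespace-insensitive comparison are all assumptions; the package may compute or encode the value differently.

```javascript
// Hypothetical sketch: URL_QUALITY modelled as a boolean.
// Lower-casing and stripping whitespace are assumptions made so that
// "hf3507/20" and "HF 3507/20" still match "HF3507/20".
function normalize(text) {
  return text.toLowerCase().replace(/\s+/g, "");
}

function urlQuality(name, sku) {
  return normalize(name).includes(normalize(sku));
}

// Builds the <URL, name, URL_QUALITY> triple from one search result.
// The shape of `result` ({ url, name }) is assumed for illustration.
function toTriple(result, sku) {
  return {
    url: result.url,
    name: result.name,
    urlQuality: urlQuality(result.name, sku),
  };
}
```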

Ideas for improvements (next versions)

  • a query like site:amazon.de "Modellnummer:HF3507/20" is much stricter and returns few results, usually just one. It could be used to increase the precision of the algorithm (lower recall is acceptable)
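
One possible shape for this idea is sketched below. Both function names are hypothetical, the "Modellnummer" label is taken from the amazon.de example and would differ per channel, and accepting a strict search only when it returns exactly one hit is an assumed policy, not something the package implements.

```javascript
// Hypothetical helper: builds the stricter query from the idea above.
function buildStrictQuery(sku, domain, label = "Modellnummer") {
  return `site:${domain} "${label}:${sku}"`;
}

// Assumed policy: trust the strict search only when it is unambiguous,
// i.e. it returned exactly one result; otherwise report no match.
function pickStrictResult(results) {
  return results.length === 1 ? results[0] : null;
}

console.log(buildStrictQuery("HF3507/20", "amazon.de"));
// prints: site:amazon.de "Modellnummer:HF3507/20"
```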

Resources

  • sometimes the site operator works and sometimes it doesn't; the same goes for domain. Check the following examples

TODO

  • Print a report: out of the total number of entries, how many URLs have been found (percentage and absolute numbers)
  • Add validation of the spreadsheet columns: each row should have at least the fields <sku, channel, urlName, urlQuality, url>
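
The two TODO items could look roughly like the sketch below. It assumes each spreadsheet row has already been read into a plain object keyed by column name, and that a URL counts as found when the url field is a non-empty string; both assumptions, and all function names, are illustrative only.

```javascript
// Required columns, as listed in the TODO above.
const REQUIRED_FIELDS = ["sku", "channel", "urlName", "urlQuality", "url"];

// Returns the names of the required fields that are absent from a row.
function missingFields(row) {
  return REQUIRED_FIELDS.filter((field) => !(field in row));
}

// Counts how many rows have a non-empty url (assumed meaning of "found").
function buildReport(rows) {
  const total = rows.length;
  const found = rows.filter(
    (row) => typeof row.url === "string" && row.url.trim() !== ""
  ).length;
  const percentage = total === 0 ? 0 : (found / total) * 100;
  return { total, found, percentage };
}

// Formats the report with absolute numbers and a percentage.
function formatReport({ total, found, percentage }) {
  return `${found} of ${total} URLs found (${percentage.toFixed(1)}%)`;
}
```

For example, missingFields({ sku: "HF3507/20", channel: "amazon.de" }) returns ["urlName", "urlQuality", "url"], which could be used to reject a sheet before any search is run.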